By Cindy Royal and Dale Blasingame
The usage and application of data are reflected in recent, high-profile topics in the news. The phrase “data journalism” began being used in place of “computer-assisted reporting” in the mid-2000s. But a precise and comprehensive conceptual definition has yet to be accepted. The phrase itself has a broad range of meanings that often cause confusion and difficulty of categorization. Using grounded theory to identify assertions of the phrases “data journalism” or “data-driven journalism” in both academic literature and media, this paper explicates a conceptual definition of data journalism and identifies several dimensions under which the definition may be operationalized.
The usage and application of data are reflected in recent, high-profile topics in the news. With topics ranging from investigations involving leaks of government information to the use of the phrase “big data” as it relates to social media, healthcare, business and science, understanding the role of data in our lives is more relevant than ever. The phrase “data journalism” began being used in place of the more traditional “computer-assisted reporting” in the mid-2000s and is often seen as interchangeable with the phrase “computational journalism.” But, to date, few academic studies have focused on this phenomenon, and a precise and comprehensive conceptual definition has yet to be accepted. The phrase itself has a broad range of meanings that often cause confusion and difficulty of categorization. This paper explicates the definition of “data journalism” in hopes of providing clarity for future tracks of study and ways to operationalize it in research.
The use of data is an important part of most modern websites. It’s what makes blogs, content management systems and social media possible, and it is evident in how media organizations use analytics for making decisions. Data can be used as a source for a story, used in infographics as part of a story or it can be the story, a data-driven interactive that allows the user to engage and customize the meaning based on variables and inputs. In 2014, an entire article was written with data, breaking the story of a Los Angeles earthquake by filling in specific data points in a predefined format (Neal, 2014; Schwencke, 2014).
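The earthquake story illustrates the simplest form of automated writing: a fixed sentence pattern with slots for data points. The sketch below shows the general idea in Python. The template wording, field names and sample values are hypothetical placeholders chosen for illustration, not the format or data used in the story cited above.

```python
from string import Template

# Hypothetical sentence pattern; the cited story's actual wording is not reproduced here.
STORY_TEMPLATE = Template(
    "A magnitude $magnitude earthquake was reported $distance miles "
    "from $place on $date, according to $source."
)


def write_story(event):
    """Fill the predefined format with the data points for one event.

    Raises KeyError if a required data point is missing.
    """
    return STORY_TEMPLATE.substitute(event)


# Placeholder values for illustration only, not real measurements.
example_event = {
    "magnitude": "4.0",
    "distance": "5",
    "place": "Example City",
    "date": "Jan. 1",
    "source": "an example seismic feed",
}

print(write_story(example_event))
```

Because `substitute` raises an error when a field is absent, an incomplete data record cannot silently produce a sentence with a missing value.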
News organizations, like Nate Silver’s FiveThirtyEight site, have employed data analyses as their niche, and with the rise of journalistic platforms, sites like Medium, Vox, BuzzFeed and Upworthy rely on data to manage content that comes from a range of sources. Data is the basis for the platforms that drive a media company’s online businesses (Royal, 2014). But it can also be a controversial topic as it relates to measuring a journalist’s output (Kirkland, 2014), and critiques of data journalism are being voiced as a caution against bias (Schrager, 2014).
There are numerous examples of organizations considered practitioners of data journalism. The New York Times was a pioneer in developing news applications that rely on data, with such projects as “Is It Better to Rent or Buy?”, “New York State Test Scores” and “Toxic Waters,” as well as projects built around its coverage of the Olympics and the Academy Awards.
The interactive dialect quiz “How Y’all, Youse and You Guys Talk” was the most visited story on nytimes.com in 2013 (and it was developed by an intern!). Interactive quizzes like this or those developed by BuzzFeed indicate a trend toward stories that incorporate user participation.
The Texas Tribune hosts a range of data applications, including its Government Salaries Explorer and Public Schools Explorer. Other organizations regularly practicing data journalism techniques include the Los Angeles Times, WNYC, NPR, the Chicago Tribune, ProPublica and The Guardian, with many other news organizations attempting to incorporate data presentations into their workflows.
With this range of potential uses for data, it seems like an appropriate time to understand what exactly is meant by “data journalism.” This paper contributes to the research on data journalism to advance scholarly understanding of the concept by highlighting a range of assertions and identifying areas of categorization, specific dimensions, that will be useful in operationalizing the term in future studies.
The concept of “data journalism” may seem new, but it has roots in the well-established fields of graphic design and computer-assisted and investigative reporting. Edward Tufte, in his book The Visual Display of Quantitative Information (2001, p.13), defined a clear mission for statistical graphic design: “Excellence in statistical graphics consists of complex ideas communicated with clarity, precision and efficiency.” This definition places the presentation of data firmly within the mission of journalism.
Philip Meyer, Professor Emeritus at the University of North Carolina, wrote the definitive books on computer-assisted reporting. He pioneered using social science methods in journalism in his Pulitzer Prize-winning coverage of the 1967 Detroit race riots. Meyer said, “knowing what to do with data is the essence of the new precision journalism” (Meyer, 1991).
What makes modern “data journalism” approaches new are the ability to present information online with an interactive component that allows for user customization and the use of databases to populate graphics with dynamic information. Adrian Holovaty, formerly of WashingtonPost.com, founder of the now defunct website EveryBlock.com and considered an early adopter of data journalism, used the phrase “programmer as journalist,” defining an emerging technical role in news development for people with the skills to launch visualizations to the Web (Niles, 2006).
The use of data in journalism has been questioned and is often positioned in opposition to traditional reporting methods. On his website, Holovaty gave what he considered a “definitive, two-part answer” to the question “is data journalism?” In a most direct way, he dealt with those who questioned data’s storytelling ability.
It’s a hot topic among journalists right now: Is data journalism? Is it journalism to publish a raw database? Here, at last, is the definitive, two-part answer:
1. Who cares?
2. I hope my competitors waste their time arguing about this as long as possible. (Holovaty, 2009)
The comments on that post ranged from snarky to serious, but Holovaty’s contempt for those who questioned data’s place in journalism was evident.
Data journalism has become an emerging topic in academic research. Royal (2010; 2012) published a case study of The New York Times Interactive News team, spending a week interviewing team members and studying their processes. She identified the unique skill sets, workflows and culture of an organization that integrated programming with storytelling. Royal (2013) also analyzed the diffusion of Olympic interactives of The New York Times over time. Parasie and Dagiral (2013) studied data-driven journalism at the Chicago Tribune and found that programmer-journalist practitioners have provided new ways for journalism to address social good.
The role of those practicing modern data journalism has been evolving since the mid-2000s. The New York Times Interactive News Technology department was featured in a New York Magazine article (Nussbaum, 2009). The function was described as one that “elevated coders into full-fledged members of the Times—deputized to collaborate with reporters and editors, not merely to serve their needs.”
Lewis and Usher (2013) more generally discussed the role of technology in newsrooms, specifically the introduction of open source culture and the professional organization Hacks and Hackers.
One of the first books written on “data journalism” was Facts Are Sacred (2013) by Simon Rogers, chronicling his responsibilities when he was data editor for The Guardian. Rogers is now data editor at Twitter.
Rogers (2013, p. 19) said this of data journalism:
“Data journalism” or “computer-assisted reporting”? What is it? How do you describe it? Is it even real journalism? These are just two terms for the latest trend, a field combining spreadsheets, graphics, data analysis and the biggest news stories to dominate reporting in the last two years.
Rogers also pointed to a “new, widespread transparency movement” and identified the four factors below as instrumental to data journalism’s evolution.
• the widespread availability of data via the Internet;
• easy-to-use spreadsheet packages on every home computer;
• a growing interest in visualizing data, to make it easier to understand;
• some huge news stories that would not have existed without the statistics behind them. (Rogers, 2013, p. 20).
In Data Journalism: Mapping the Future, Beleaga (Mair & Keeble, 2013, p. 27) gave this definition of data journalism: “Reduced to its most basic feature, data journalism, or data-driven journalism, as it is also referred to within the industry and the academy, is the process of telling stories with data.”
Joannes (Mair & Keeble, 2013, p. 28) added to our understanding by connecting data journalism to its role in a democracy. “Data journalism is a form of rich media with an added dimension: it implies a return to the factual, to the investigative. It’s about interrogating the data, finding and formatting the relationships. Data journalism is a tool of democracy.”
Other books have introduced data concepts as relevant to journalism, including Visualize This: The FlowingData Guide to Design, Visualization and Statistics (Yau, 2011) and Data Points: Visualization That Means Something (Yau, 2013).
To move beyond a basic understanding of data journalism, a more nuanced and thorough analysis of the phrase is in order. According to Fink and Anderson (2014, p. 2), “Data journalism is ultimately a deeply contested and simultaneously diffuse term, and thus would seem to impose analytical difficulties for those who wish to study it.”
Coddington provided a typology of features of differentiation (professional orientation, openness, epistemology and vision of public) between the often-related fields of computer-assisted reporting, data journalism and computational journalism. “For researchers, however, these definitional questions are fundamental to analyzing these practices as sites of professional and cultural meaning, without which it is difficult for a coherent body of scholarship to be built” (Coddington, 2014, p. 2).
De Maeyer et al. (2014, p. 8), in studying data journalism in Belgian media, also found difficulty in defining the field. “The very notion of data journalism shows a remarkably wide range of meanings among our respondents: despite the fact that there are themes that connect the diversity of discourses, there is no consensus on core issues regarding the definition of the phenomenon.”
The area of data journalism saw a spate of articles in late 2014 that addressed its importance in scholarship. Data journalism was identified as a significant development and emerging area (Franklin, 2014; Lewis, 2014) and aligned with the history of quantification in journalism practice (Anderson, 2014). Fink and Anderson (2014, p. 1) carried out interviews with data journalists to define the field, using a grounded theory approach. “Understanding the phenomenon of data journalism requires an examination of this emerging practice not just within organizations themselves, but across them, at the inter-institutional level.” They describe a field in development, thus in need of clarification and understanding.
The Tow Center for Digital Journalism at the Columbia Journalism School released the report “The Art and Science of Data-Driven Journalism” in May 2014 (Howard, 2014). A special issue of the academic journal Digital Journalism is planned for publication in 2015 on “Journalism in the Era of Big Data.” In November 2014, the articles associated with the special issue were made available online. These projects represent a significant expansion of scholarly understanding dealing with the role of data in journalism, and many of the assertions offered by these articles are represented in this analysis.
The method for this analysis is a qualitative study combining concept explication and grounded theory, complemented by quantitative word frequency analysis and video interviews. Grounded theory employs a systematic, qualitative exploratory approach (Glaser, 1967). Chaffee provided guidelines for performing concept explication (1991), and studies of this nature have been applied in journalism scholarship when issues are emerging or contested. Examples include research studies that explicate interactivity (Kiousis, 2002) and journalism-as-a-conversation (Marchionni, 2013).
This analysis used a modified version of Chaffee’s approach by identifying relevant literature and locating potential assertions of “data journalism” as used in communication and other academic sources and general press. We then identified several relevant dimensions and developed a comprehensive conceptual definition. The goal was to identify as many definitions or descriptions of the phrase as possible within sources relevant to journalism and mass communication.
RQ1: What are the sources publishing assertions of “data journalism”?
RQ2: What themes emerged in the assertions of “data journalism”?
RQ3: What are the dimensions of the phrase “data journalism” evident in assertions provided across journalism resources?
RQ4: What is the comprehensive, conceptual definition of “data journalism”?
The search for assertions of “data journalism” began with academic research about communication, using the Communication Source research database. A search of the database for “data journalism” or “data-driven journalism” yielded 12 peer-reviewed articles. When not controlling for “peer-reviewed” publications, 28 items were present in the database. Of the 28, 15 were used in the analysis. Non-English language articles and those that did not contain a useful description of the phrase “data journalism” were eliminated.
The phrase “data journalism” is much more widely used on the open Internet, on websites that report on current trends in journalism, and may be too new to have generated many academic research papers. Sources that the researchers knew to cover new issues related to journalism were tapped, including PBS MediaShift and the Poynter Institute, along with other academic sources related to journalism, including Nieman Journalism Lab, #ISOJ Journal and Digital Journalism. Finally, a general Google search for the term was performed to identify any other relevant sources that used the phrase. While this approach does not represent a generalizable sample, it was chosen to generate as broad a range of descriptions of “data journalism” as possible.
In addition to the textual analysis performed in this study, the authors interviewed data journalism professionals and educators at the Online News Association conference in Chicago in September 2014. We asked three questions:
1. What is data journalism?
2. What are the skills necessary to perform data journalism?
3. What trends do you see for the future of data journalism?
The videos of their responses are organized by section below. These responses correspond to the coding scheme of this study and provide additional validation for the direction and results. A consolidated video is provided at the end of the paper.
Identification of Sources
In answering Research Question #1, a total of 63 assertions defining or describing “data journalism” were identified across 23 sources. Each source is identified as having originated in the Communication Source research database, a professional online publication or other academic source not identified in the research database. Other academic sources were identified as those being associated with a university or an academic journal known for its work in the area that is not yet indexed by the Communication Source database. It is important to note that most of the early discussion of “data journalism” occurred in the professional online or other academic sources.
Sources of Data Journalism Definitions
• #ISOJ Journal – Other Academic
• British Journalism Review – Communication Source
• Columbia Journalism Review – Other Academic
• Data Journalism Handbook – Professional Online
• DataDrivenJournalism.net – Professional Online
• Digital Journalism – Other Academic
• Editor and Publisher – Communication Source
• Global Media Journal: Australian Edition – Communication Source
• Global Media Journal: Canadian Edition – Communication Source
• Guardian and Guardian Datablog – Professional Online
• Index on Censorship – Communication Source
• Intermedia – Communication Source
• Journalism – Communication Source
• Journalism and Mass Communication Quarterly – Communication Source
• Journalism Studies – Communication Source
• Mashable – Professional Online
• Media Magazine – Communication Source
• New Media and Society – Communication Source
• Nieman Lab – Other Academic (Harvard)
• PBS MediaShift – Professional Online
• Poynter – Professional Online
• Quill – Communication Source
• Tow Center Blog/Reports at Columbia University – Other Academic
The years in the sample were referenced as follows, indicating a strong increase in interest and discourse about “data journalism” over time (see Figure 1).
To answer Research Question #2, the full text of all the assertions was analyzed for frequency of terms via a Python script developed by the authors for counting word frequency. The following were the most commonly used themes across the sample, controlling for “data” and “journalism” across the assertions (see Figure 2).
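The authors' script is not reproduced in the paper. As a rough illustration of this kind of count, the following minimal Python sketch tallies word frequencies across a list of assertions while excluding the controlled terms “data” and “journalism.” The function name, the short stop-word list and the tokenizing pattern are assumptions made for this sketch; the two sample strings are assertions quoted later in this paper.

```python
import re
from collections import Counter

# Terms excluded from the ranking because nearly every assertion contains them.
CONTROLLED_TERMS = {"data", "journalism"}

# A short, hypothetical stop-word list; a fuller list would be used in practice.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "that", "with",
    "from", "it", "for", "on", "or", "as", "are", "by", "be",
}


def word_frequencies(assertions, top_n=10):
    """Return the top_n (word, count) pairs across all assertions."""
    counts = Counter()
    for text in assertions:
        # Lowercase the text and keep alphabetic words, including hyphenated ones.
        words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
        counts.update(
            w for w in words
            if w not in CONTROLLED_TERMS and w not in STOP_WORDS
        )
    return counts.most_common(top_n)


sample = [
    "Data journalism is the practice of finding stories in numbers "
    "and using numbers to tell stories",
    "a reporting process that uses spreadsheet programs to generate "
    "statistics from public records and data sets",
]

for word, count in word_frequencies(sample, top_n=5):
    print(word, count)
```

This sketch counts exact word forms only; grouping variants such as “visualizing” and “visualization” into a single theme would require an additional stemming or manual grouping step.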
The prevalence of these terms indicates that statements dealing with “data journalism” are grounded in many of the concepts of traditional journalism (storytelling, news, information, sources and reporting), while words like statistics, visualizing, interactive, tools, database and programming add the new or contemporary element.
In addressing Research Question #3, the assertions were coded and categorized using the grounded theory approach, with the goal of identifying the various dimensions of the phrase. The following dimensions emerged within the sample (see Figure 3). Some definitions exhibited characteristics of multiple dimensions and were categorized as hybrid statements.
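The coding itself is a manual, qualitative judgment, but the tallying rule described here (an assertion coded with more than one dimension is counted as Hybrid) can be sketched in a few lines of Python. The function and variable names are hypothetical; the three coded examples follow the codings reported for those assertions in this paper.

```python
from collections import Counter

# The six single dimensions identified in this analysis.
DIMENSIONS = {
    "Process",
    "Product",
    "Convergence of Fields",
    "Traditional",
    "Outside Influence",
    "Skills",
}


def categorize(codes):
    """Map the dimensions coded for one assertion to a single category."""
    codes = set(codes)
    if not codes or not codes <= DIMENSIONS:
        raise ValueError(f"unrecognized or missing codes: {sorted(codes)}")
    # More than one dimension makes the assertion a hybrid statement.
    if len(codes) > 1:
        return "Hybrid"
    return next(iter(codes))


# Codings as reported in this paper for three of the assertions.
coded_assertions = {
    "Hackett, 2013": {"Process"},
    "Royal, 2012": {"Product"},
    "Arthur, 2010": {"Process", "Product"},
}

tally = Counter(categorize(codes) for codes in coded_assertions.values())
for category, count in tally.most_common():
    print(category, count)
```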
Note: Complete coded definitions can be found in the Google Spreadsheet at https://docs.google.com/spreadsheets/d/1PXKt_kTpwC7Wy8lF0_uFr471p_zFAYsBIe2pPMcVCuQ/edit?usp=sharing.
The results are discussed below.
These categories are consistent with and add to areas identified by Coddington (2014, p. 5):
Professional definitions have tended to be broad, characterizing data journalism as essentially any activity that deals with data in conjunction with journalistic reporting and editing or toward journalistic ends…. Several others have defined data journalism in terms of its convergence between several disparate fields and practices, characterizing it as a hybrid form that encompasses statistical analysis, computer science, visualization and web design, and reporting… Data journalism has been closely associated with the use and proliferation of open data and open-source tools to analyze and display that data…, though open data is not necessarily or exclusively a part of its domain of practice.
Dimensions of Data Journalism
At the most basic level, a general definition was provided by Paul Bradshaw in the Data Journalism Handbook (Gray, Bounegru, & Chambers, 2012): “What is data journalism? I could answer, simply, that it is journalism done with data. But that doesn’t help much.” He goes on to explain why data journalism is so difficult to define.
Both ‘data’ and ‘journalism’ are troublesome terms. Some people think of ‘data’ as any collection of numbers, most likely gathered on a spreadsheet. Twenty years ago, that was pretty much the only sort of data that journalists dealt with. But we live in a digital world now, a world in which almost anything can be—and almost everything is—described with numbers.
The book’s introduction, entitled “What is Data Journalism?”, includes a range of potential definitions that span several of the dimensions identified in this analysis. As the comprehensive textbook on “data journalism,” the handbook was the most prolific source of early descriptions of the phrase.
Introduction: A video with introductory comments is provided at https://www.youtube.com/watch?v=k2aBj56P6tc
The most prevalent theme was the one that dealt with the process or general function of “data journalism.” Assertions including the term “process” or describing actions related to process as the primary function with words including “aggregating,” “filtering,” “organizing” or “visualizing” were coded in the Process dimension. Examples of the Process dimension include:
[Data journalism is] “a reporting process that uses spreadsheet programs to generate statistics from public records and data sets” (Hackett, 2013, p. 35).
[Data journalism is] “the aggregating, filtering, and visualizing of large sets of data, based on statistical methods of data analysis” (Dreyfus, Lederman, & Bosua, 2011, p. 4).
“I would say data journalism is such a wide range now of styles—from visualisation to long form articles. The key thing they have in common is that they’re based on numbers and statistics—and that they should aim to get a ‘story’ from that data. The ultimate display of that story, be it words or graphics, is irrelevant, I think—it’s more about the process” (Rogers, 2012).
“Data journalism is the practice of finding stories in numbers and using numbers to tell stories” (Howard, Art and Science, 2014).
“Doing data journalism implies to ‘process data’, to access it, to correlate it, and finally to present it, but also to do a form of data-seeking journalism, or even a way to use databases” (De Maeyer et al., 2014, p. 8).
Process: The video demonstrating Process definitions is available at https://www.youtube.com/watch?v=BkFPuyfjAdE
Other sources focused on the news products or outcomes of a “data journalism” activity, referencing “graphics,” “infographics,” “customization,” “charts,” “maps” and “Web apps.” Examples of the Product dimension include:
“Data journalism can help a journalist tell a complex story through engaging infographics” (Gray, Bounegru, & Chambers, 2012).
[Data journalism is] “news products that engage the user and that often use a database to populate the information” (Royal, 2012, p. 10).
“Any method of storytelling that engages the user with customization and interactivity and presents data in a visual manner through charts, maps and simulations” (Royal, 2013, p. 112).
“Data journalism—interactives, infographics, charts and tables—were tapped to convey factual aspects like historical timelines and status of gun control policy” (Xie, 2013).
“Some stories are just better told as databases and interactive web apps” (Betancourt, 2009).
Product: The video demonstrating Product definitions is available at https://www.youtube.com/watch?v=Qi-iXJV9iow
Convergence of Fields
Assertions that described “data journalism” by referencing a list of different academic or professional fields were coded in the Convergence of Fields dimension. These areas included social science, statistics, data analysis, data science and computer science. The statements in this dimension emphasized the intersection of journalism with other fields. Examples of the Convergence of Fields dimension include:
“Data journalism is ‘incomprehensibly enormous,’ in part because it represents the convergence of several fields—programming, design, statistics and investigative research, to name a few” (Bradshaw, 2010).
“In the hands of the most advanced practitioners, data journalism is a powerful tool that integrates computer science, statistics, and decades of learning from the social sciences in making sense of huge databases” (Howard, Art and Science, 2014).
“Data journalism is currently an emerging form of storytelling, where traditional journalistic working methods are mixed with data analysis, programming and visualization techniques” (Appelgren & Nygren, 2014, p. 394).
“Data journalism is a journalism specialty reflecting the increased role that numerical data is used in the production and distribution of information in the digital era. It reflects the increased interaction between content producers (journalist) and several other fields such as design, computer science and statistics. From the point of view of journalists, it represents ‘an overlapping set of competencies drawn from disparate fields’” (Thibodeaux, 2011).
Convergence of Fields: The video demonstrating definitions dealing with Convergence of Fields is available at https://www.youtube.com/watch?v=x8Ozu3sYvU0
The Traditional category included assertions that explained that “data journalism” was not a new field and should be considered in relation to the legacy of data analysis in journalism. It also included descriptions defining “data journalism” in comparison to what it is not or how it differed from traditional journalism. Examples of the Traditional dimension include:
“Data journalism and its practice are not new, along with existing critiques of its practices or of programming in journalism generally” (Howard, Debugging the Backlash, 2014).
“One of the editors points out that analyzing data is not in itself something new for journalists, however, the new tools that are currently available speed up the process of working with large data sets” (Appelgren & Nygren, 2014, p. 403).
“What makes data journalism different to the rest of journalism? Perhaps it is the new possibilities that open up when you combine the traditional ‘nose for news’ and ability to tell a compelling story, with the sheer scale and range of digital information now available” (Gray, Bounegru, & Chambers, 2012).
“Although journalists have been using data in their stories for as long as they have been engaged in reporting, data journalism is more than traditional journalism with more data” (Howard, Art and Science, 2014).
“What’s new isn’t so much data journalism, but rather the method that allows us to cross-tabulate data on a large scale” (De Maeyer et al., 2014, p. 9).
Traditional: The video demonstrating Traditional definitions is available at https://www.youtube.com/watch?v=hk_zeWTSCAk
The descriptions in the Outside Influence category defined “data journalism” as it relates to its effects on individuals and culture. These assertions framed the description in terms of the benefits associated with users and society. Assertions that discussed “Freedom of Information” and the role of journalists in a democracy were included in the Outside Influence dimension. Examples of the Outside Influence dimension include:
“It can help explain how a story relates to an individual” (Gray, Bounegru, & Chambers, 2012).
“Data-driven journalism” improves the way journalism can contribute to democracy—especially at a time when a growing number of data sets are released by governments” (Parasie & Dagiral, 2013, p. 855).
“After all, programming and data are journalism. And it can be practiced in such a way that it can create interaction, user engagement, and more information in terms of seeking the truth. Especially when you talk about Freedom of Information access to government data—if the public can have access to that in a way that makes sense to them, or in a way that’s easy for them to use, then that’s just really powerful” (Garber, 2010).
“Reporting that seeks to quantify events and make real, numerical sense of human suffering and significant events” (Arana, 2012, p. 178).
“Building capacity in data journalism is directly connected to the role the Fourth Estate plays in democracies around the world. There are important stories buried in that explosion of data from government, industry, media, universities, sensors, and devices that aren’t being told because the perspective and skills required to do it properly aren’t widespread in the journalism industry” (Howard, Art and Science, 2014).
Outside Influence: The video demonstrating definitions dealing with Outside Influence is available at https://www.youtube.com/watch?v=VTkuOP18w_Y
A set of assertions emphasized the skills, roles or technologies one needs to perform data journalism, often framed in an educational context. Examples of the Skills dimension include:
“I think schools are making a better effort to train young journalists in many of the skills that fall under the umbrella of data journalism: data wrangling, analysis, visualization; statistics; digital literacy (how does the Web work?); Web development” (Howard, Profile of the Data Journalist, 2014).
“Data journalism is a new set of skills for searching, understanding and visualizing digital sources in a time that basic skills from traditional journalism just aren’t enough. It’s not a replacement of traditional journalism, but an addition to it” (Gray, Bounegru, & Chambers, 2012).
“Emergence of a new generation of web-based technologies that have made the presentation and visualization of data-driven stories easy even for those with no database or web development experience” (Vallance-Jones, 2013, p. 19).
Skills: The video demonstrating definitions dealing with Skills is available at https://www.youtube.com/watch?v=DS4Jj9VUUZo
Finally, a series of assertions employed more than one dimension to describe the breadth of “data journalism,” often invoking process along with product, skills or outside influence. Examples of the Hybrid dimension include:
“Data journalism is bridging the gap between stat technicians and wordsmiths. Locating outliers and identifying trends that are not just statistically significant, but relevant to de-compiling the inherently complex world of today” (Gray, Bounegru, & Chambers, 2012). This assertion exhibited the Process, Skills and Outside Influence dimensions.
“Sourcing, reporting and presenting stories through data-driven journalism, and visualising and presenting data (including databases, mapping and other interactive graphics)” (Arthur, 2010). This assertion exhibited Process and Product dimensions.
“Data journalism is an umbrella term that, to my mind, encompasses an ever-growing set of tools, techniques and approaches to storytelling. It can include everything from traditional computer-assisted reporting (using data as a ‘source’) to the most cutting edge data visualization and news applications. The unifying goal is a journalistic one: providing information and analysis to help inform us all about important issues of the day” (Gray, Bounegru, & Chambers, 2012). This assertion exhibited the Process, Product and Outside Influence dimensions.
“Broadly speaking, ‘data journalism’ is a fairly recent term that is used to describe a set of practices that use data to improve the news. These range from using databases and analytical tools to write better stories and do better investigations, to publishing relevant datasets alongside stories, and using datasets to deliver interactive data visualizations or news apps” (Gray, 2012). This assertion exhibited Process and Product dimensions.
“It argues that journalism, and hence data journalism, can be understood as a socio-discursive practice: it is not only the production of (data-driven) journalistic artefacts that shapes the notion of (data) journalism, but also the discursive efforts of all the actors involved, in and out of the newsrooms” (De Maeyer et al., 2014, p. 3). This assertion exhibited the Process and Product dimensions.
Hybrid: The video demonstrating Hybrid definitions is available at https://www.youtube.com/watch?v=ce570olLO4Y
Conclusion & Future
The phrase “data journalism” inspired a range of descriptions across various sources within the dimensions of Process, Product, Convergence of Fields, Traditional, Outside Influence and Skills. Some definitions sought to capture the breadth of “data journalism” with comprehensive hybrid descriptions. The most common assertions focused on the Process dimension, with a relatively even representation of assertions across most other dimensions.
Based on this analysis, the following comprehensive, conceptual definition is offered in answer to Research Question #4:
Data journalism is a process by which analysis and presentation of data are employed to better inform and engage the public. Its roots are in the fields of computer-assisted and investigative reporting, but data journalism products may add engagement through customization and user contribution made possible by Web development and programming techniques.
However, a definition that includes too many dimensions may not be useful for a specific purpose. The focus of the definition may need to vary in order to operationalize a research question based on the purpose or needs of the specific analysis. This analysis provides the dimensions under which operational definitions can be crafted or scholarship can be focused.
This is an important area of study because of its relative newness and rapidly changing nature. Before one can study a field, one must comprehend the range of dimensions to better focus on a particular phenomenon. Future research can emphasize specific dimensions or analyze interactions between dimensions.
This area is also important because there has been criticism of the lack of interest or aptitude within the journalism profession in working with numbers and data. Adrian Holovaty (2006) said, “I’ve only met a handful of people who became journalists because they like information. And I think that helps explain why there have been some major cultural issues in the journalism world in the age of the Internet.”
Or as Aron Pilhofer, formerly of The New York Times and currently Executive Editor of Digital at The Guardian, said:
Journalism is one of the few professions that not only tolerates general innumeracy, but celebrates it. I still hear journalists who are proud of it, even celebrating that they can’t do math, even though programming is about logic. It’s hard to get a journalist to open up a spreadsheet, much less open up a command line. It is just not something that they, in general, think this is held to be an important skill… It’s a cultural problem. (Howard, Aron Pilhofer and Data Journalism, 2014).
Others remain skeptical as to the usefulness of data journalism:
Data journalism is nice, but it’s not life. Yet, by doing our job as journalists, we must tell life as it happens. And it’s not enough to stay behind one’s desk with a computer, one must go out into the field. Check if the data that you have is for real. You will not tell people, on television for example, that life expectancy is 70 without going out to see old people. (De Maeyer et al., 2014, p. 11).
The issues that separate journalism from the technology culture that develops the tools and platforms used to distribute news and information, as well as the general comprehension of the role of data in storytelling, will need to be better understood in order to assist the profession in dealing with these intersections.
Data journalism is considered to be in its infancy but has gained a strong following of practitioners. Several events focused on data journalism have emerged, including the School of Data Journalism conference in Perugia, Italy, each spring and the National Institute for Computer-Assisted Reporting (NICAR) conference, which has become the main event for producers of data journalism. Additionally, academic resources have been developed to teach data journalism, including Massive Open Online Courses provided by the European Journalism Centre and the Knight Center for Journalism in the Americas at The University of Texas at Austin. Academic journalism programs have begun introducing data and visualization techniques into their curricula, and new programs exist to combine computer science and journalism.
“Data journalism” is expected to continue to be the focus of news stories and scholarship. A field of “computational journalism” is also emerging that shares many aspects of “data journalism,” although the term’s use has been mostly academic (Cohen, Hamilton & Turner, 2011; Anderson, 2013; Flew, Spurgeon, Daniel & Swift, 2012). The role of “big data” is expected to continue to be an important area for journalism practitioners and researchers. “Big data invokes a wide range of normative claims and practical implications for journalism as a professional practice and an organizational production—from knowledge work and economic rationale to practical skills and philosophical ethics” (Lewis & Westlund, 2014, p. 3).
Limitations of this study include the method by which assertions were generated. The purpose of the selected method was to generate as many assertions of the phrase “data journalism” as possible across relevant sources in the related academic and professional arenas. But this method did not generate an exhaustive list. Other methods may generate new or different sources. The field is moving quickly, and new research is being generated at a rapid pace, which makes it difficult to pinpoint a definitive understanding of the phrase while it is still evolving.
We will continue to see new methods of reporting that employ data and programming techniques, but we need a starting point. This analysis is a first step toward better understanding the field so that it can be productively employed in scholarly research and other types of academic writing. It provides a more nuanced and systematic understanding of data journalism as a research area within the realm of journalism scholarship.
Full Video of all the segments above is available at https://www.youtube.com/watch?v=p8rMvHKXWKc.
Anderson, C. W. (2014). Between the unique and the pattern: Historical tensions in our understanding of quantitative journalism. Digital Journalism. Published online Nov. 20, 2014.
Anderson, C. W. (2013). Towards a sociology of computational and algorithmic journalism. New Media & Society, 15(7), 1005-1021.
Bradshaw, P. (2010). How to be a data journalist. Guardian Data Blog. October 1, 2010. http://www.theguardian.com/news/datablog/2010/oct/01/data-journalism-how-to-guide.
Chaffee, S. H. (1991). Explication (Vol. 1). SAGE Publications, Incorporated.
Coddington, M. (2014). Clarifying journalism’s quantitative turn. Digital Journalism. Published online Nov. 7, 2014. http://www.tandfonline.com/doi/abs/10.1080/21670811.2014.976400?src=recsys#.VSP7aJPF87Q
Cohen, S., Hamilton, J. T., & Turner, F. (2011). Computational journalism. Communications of the ACM, 54(10), 66-71.
De Maeyer, J., Libert, M., Domingo, D., Heinderyckx, F., & Le Cam, F. (2014). Waiting for data journalism. Digital Journalism. Published online Nov. 27, 2014. http://www.tandfonline.com/doi/abs/10.1080/21670811.2014.976415#.VSP7opPF87Q
Fink, C., & Anderson, C. (2014). Data journalism in the United States. Journalism Studies. Published online Aug. 8, 2014. http://dx.doi.org/10.1080/1461670X.2014.939852.
Flew, T., Spurgeon, C., Daniel, A., & Swift, A. (2012). The promise of computational journalism. Journalism Practice, 6(2), 157-171.
Franklin, B. (2014). The future of journalism. Digital Journalism, 2(3).
Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research. Piscataway, NJ: Transaction.
Gray, J., Bounegru, L., & Chambers, L. (2012). The data journalism handbook. O’Reilly Media, Inc.
Holovaty, A. (2006). A fundamental way newspaper sites need to change. September 6, 2006. http://www.holovaty.com/writing/fundamental-change
Holovaty, A. (2009). The definitive, two-part answer to “Is data journalism?” May 21, 2009. http://www.holovaty.com/writing/data-is-journalism/
Howard, A. (2014). Aron Pilhofer on data journalism, culture and going digital. Tow Center for Digital Journalism Blog. March 27, 2014. http://towcenter.org/blog/aron-pilhofer-data-journalism-culturegoing-digital
Howard, A. (2014). The art and science of data-driven journalism. Tow Center for Digital Journalism Blog. May 30, 2014. http://towcenter.org/blog/the-art-and-science-of-data-driven-journalism/
Kiousis, S. (2002). Interactivity: A concept explication. New Media & Society 4(3), 3.
Kirkland, S. (2014). As reporters get measured, why even BuzzFeed, Upworthy aren’t beholden to numbers. Poynter.org. March 25, 2014. http://www.poynter.org/latest-news/mediawire/244541/asreporters-get-measured-why-even-buzzfeed-upworthy-arent-beholden-to-numbers/
Lewis, S. (2014). Journalism in the era of big data. Digital Journalism. Published online Nov. 27, 2014.
Lewis, S., & Usher, N. (2013). Open source and journalism: Toward new frameworks for imagining news innovation. Media, Culture & Society, 35(5), 602-619.
Lewis, S., & Westlund, O. (2014). Big data and journalism. Digital Journalism. Published online Nov. 27, 2014. http://www.tandfonline.com/doi/abs/10.1080/21670811.2014.976418?src=recsys#.VSP8DJPF87Q
Mair, J., & Keeble, R. L. (2014). Data journalism: Mapping the future. Suffolk, UK: Abramis.
Marchionni, D. M. (2013). Journalism as a Conversation: A concept explication. Communication Theory, 23(2), 131-147.
Meyer, P. (1991). The new precision journalism. Bloomington, IN: Indiana University Press.
Neal, R. (2014). Robo-journalism: LA Times bot writes and publishes earthquake article in 3 minutes. International Business Times. March 19, 2014. http://www.ibtimes.com/robo-journalism-latimes-bot-writes-publishes-earthquake-article-3-minutes-1562397
Niles, R. (2006). The programmer as journalist: A Q&A with Adrian Holovaty. Online Journalism Review. June 5, 2006. http://www.ojr.org/the-programmer-as-journalist-a-qa-with-adrian-holovaty/
Nussbaum, E. (2009, January 11). Goosing the Gray Lady. New York Magazine.
Parasie, S., & Dagiral, E. (2013). Data-driven journalism and the public good: ‘Computer-assisted-reporters’ and ‘programmer-journalists’ in Chicago. New Media & Society, 15(6), 853-871.
Rogers, S. (2013). Facts are Sacred: The power of data. London, UK: Faber & Faber.
Royal, C. (2013). Interactives of Olympic proportions: Diffusion of innovation at The New York Times. ISOJ Journal, Spring, Vol. 3(2).
Royal, C. (2010). The journalist as programmer: A case study of The New York Times interactive news technology department. Paper presented at the International Symposium On Online Journalism. Austin, Texas. April 2010. https://online.journalism.utexas.edu/2010/papers/Royal10.pdf
Royal, C. (2012). The journalist as programmer: A case study of The New York Times interactive news technology department. ISOJ Journal, Spring, Vol. 2(1).
Royal, C. (2014). Are journalism schools teaching their students the right skills? Nieman Journalism Lab. http://www.niemanlab.org/2014/04/cindy-royal-are-journalism-schools-teaching-their-studentsthe-right-skills/
Schrager, A. (2014, March 19). The problem with data journalism. Quartz. http://qz.com/189703/the-problem-with-data-journalism
Schwencke, K. (2014, March 17). Earthquake aftershock: 2.7 quake strikes near Westwood. http://www.latimes.com/local/lanow/earthquake-27-quake-strikes-near-westwood-californiardivor,0,3229825.story-axzz2wQwc82EK
Tufte, E. R., & Graves-Morris, P. R. (1983). The visual display of quantitative information (Vol. 2). Cheshire, CT: Graphics Press.
Yau, N. (2011). Visualize this: The FlowingData guide to design, visualization and statistics. Hoboken, NJ: Wiley.
Yau, N. (2013). Data points: Visualizing data that means something. Hoboken, NJ: Wiley.
Cindy Royal is an associate professor in the School of Journalism and Mass Communication at Texas State University teaching digital and data-driven media skills and concepts. She completed Ph.D. studies in Journalism and Mass Communication at The University of Texas at Austin in May 2005. Prior to doctoral studies, she had a career in marketing at Compaq Computer (now part of Hewlett Packard) in Houston and NCR Corporation in Dayton, Ohio. She has a Master of Business Administration from the University of Richmond and a Bachelor of Science in Business Administration from the University of North Carolina at Chapel Hill. In 2013, Royal received the Presidential Award for Excellence in Teaching at Texas State University and the AEJMC/Scripps Howard Journalism and Mass Communication Teacher of the Year award. During the 2013-2014 academic year, she was in residence at Stanford University in the Knight Journalism Fellowship program. She hosts a music blog at onthatnote.com and a tech blog at tech.cindyroyal.net. Additional detail regarding her research, education and experience can be found at cindyroyal.com.
Dale Blasingame is a lecturer in the School of Journalism and Mass Communication at Texas State University. He received a master’s degree in New Media from Texas State in 2011 and an undergraduate degree in journalism from Texas State in 1999. Dale teaches media writing and digital skills courses. Prior to teaching, Dale spent nine years as a producer at News 4 WOAI, where he won two Emmy awards for Best Newscast in Texas. He was a newsroom leader in recognizing the advantages of social media for television stations. Dale is also the social media manager for Leadhub, an Internet marketing company in San Antonio, Texas. He manages social and video production for clients across Texas and California. In his off time, Dale likes to travel to state and national parks, hike and (sometimes) keep track of the Dallas Cowboys.