Scalable Social Data Analysis
Abstract
Analysts in all areas of human knowledge, from science and engineering to economics, social science, and journalism, are drowning in data. The proliferation of digital information requires tools and techniques for exploring, analyzing, and communicating data in a manner that scales as both the data and the organizations analyzing it grow in size. Throughout the data life-cycle, sensemaking is often a collaborative process. As different analysts take part in data acquisition, cleaning, analysis, and interpretation, they contribute contextual knowledge that deepens understanding. Analysts may disagree on how to interpret data, but then work together to reach consensus. Many data sets are so large that thorough exploration by a single person is unlikely. In short, social cognition plays a critical role in the process of scalable data analysis. New analysis tools that address human cognitive characteristics, social interaction, and data analytics in an integrated fashion can improve our ability to turn data into knowledge.
Scalable data analysis requires social interaction, and therefore social context must be embedded in data analysis tools. The goals of this project are (1) to understand how social interaction and social context can facilitate successful data analysis, (2) to develop models and tools for representing and annotating data transformations, visualizations, and social activity (e.g., textual and graphical annotations, discussions, links, tags), and (3) to design and test visual interfaces that leverage our tools to support collaborative analysis practices, including data entry, transformation, visualization, and interpretation. Central concerns include (a) a focus on enabling social interaction throughout the data life-cycle and (b) the use of scalable data transformation routines that return results quickly enough to support interactive, exploratory data transformation and analysis.
Related Publications
Acknowledgements
This material is based upon work supported by the National Science Foundation under collaborative Grant Nos. CCF-0963922, CCF-0964173 and CCF-1355723.