This tutorial will provide an overview of the key elements that an organization should have in place to build an enterprise-level big data analytics (BDA) program. It will provide specific, hands-on material to help organizations and educational institutions embrace BDA in an organized manner, following best practices. It is organized into these main areas:

  • BDA Overview – We will start with key definitions and differentiation of various BDA approaches and terminology.
  • Analytical Modeling – We will discuss selected key topics pertaining to model selection and model specification. BDA has evolved rapidly, and a wide variety of methods and approaches is now available. We divide the related decisions into two categories: (a) model selection – an overview of modeling methods and approaches suited to specific analytic problems and goals; and (b) model specification – an overview of how to select the most effective set of predictors and of any data pre-processing that may be required. In this section we will also describe the concept of machine learning in depth and discuss various modern approaches. We will provide a detailed handout summarizing predictive modeling approaches.
  • Analytics Capabilities – We will provide an overview of popular bodies of knowledge and analytic frameworks, as well as of the most popular BDA tools, both commercial and open source. We will also provide a summary of R scripts to fit and evaluate some of the most popular machine learning and predictive models, and introduce the concept of cross-validation for model fitting, model evaluation, and parameter tuning.
  • BDA Management – We will address issues related to the alignment of business needs with organizational structures, rules, compliance, governance and life cycle models.
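
As a rough illustration of the cross-validation idea mentioned under Analytics Capabilities, the following minimal sketch uses k-fold cross-validation to tune one parameter of a simple model. It is written in Python rather than R and is not taken from the tutorial's scripts; the nearest-neighbour model, the synthetic data, and all function names (`k_fold_indices`, `knn_predict`, `cv_mse`, `tune`) are assumptions made for illustration only.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train, x, n_neighbors):
    """Predict y at x as the mean y of the n_neighbors closest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    return mean(y for _, y in nearest)


def cv_mse(data, n_neighbors, k=5, seed=0):
    """Mean squared error, averaged over the k held-out folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        # Fit on everything except the held-out fold, evaluate on that fold.
        train = [p for i, p in enumerate(data) if i not in held]
        errors = [
            (knn_predict(train, data[i][0], n_neighbors) - data[i][1]) ** 2
            for i in held_out
        ]
        fold_errors.append(mean(errors))
    return mean(fold_errors)


def tune(data, candidates, k=5, seed=0):
    """Return the candidate with the lowest cross-validated error, plus all scores."""
    scores = {c: cv_mse(data, c, k, seed) for c in candidates}
    return min(scores, key=scores.get), scores


# Hypothetical synthetic data: a noisy quadratic, used only for illustration.
rng = random.Random(42)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.5)) for x in range(60)]
best, scores = tune(data, [1, 3, 5, 9, 15])
print("best n_neighbors:", best)
```

Because every observation is held out exactly once, the averaged error estimates how each parameter value would perform on unseen data, which is what makes it usable for both evaluation and tuning.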

SWT Leaders:

Frank Armour (Primary Contact)
American University
farmour@american.edu

J. Alberto Espinosa
American University
alberto@american.edu

Which researcher has not wished, when a key lab member graduates, that they could avoid spending precious time and resources training new people on seemingly mundane data and tool issues? Good data and software management ensures preservation of knowledge and smooth continuation in the face of staff and student turnover. Researchers who do not or cannot share data still benefit from good data and software stewardship practices, inspired by the FAIR guiding principles.

Good stewardship practices are required to ensure that data is properly described for re-use, stored in media and formats accessible to future users, and simply not lost in the lab. Establishing good data and software management practices provides researchers from all disciplines with the following benefits:

  • Data, software, and knowledge preservation throughout the life cycle of a project and beyond
  • Improved training and communication through tools and protocols
  • Systematic training practices that can be tailored to individual labs or projects
  • Improved collaboration within teams through well-established data and software development practices
  • Ease of re-use for future users, including future students, lab members, external collaborators, and stakeholders
  • Compliance with funding agency requirements
  • Interoperability with publishing platforms and fulfillment of publishers’ requirements
  • Data and software re-use embedded in current and future practices

In this tutorial, participants will learn state-of-the-art stewardship practices to support preservation and reproducibility. The tutorial will be organized around three mini-sessions, each with a short demo or presentation of tools and the bulk of the time devoted to hands-on participatory activities. The tutorial will conclude with a panel on open questions, including suggestions for future areas of emphasis for this tutorial.

For more information about the tutorial, please visit https://sites.google.com/view/data-stewardship-and-reuse/home

SWT Leaders:

Line Pouchard (Primary Contact)
Brookhaven National Laboratory
pouchard@bnl.gov

Natalie Meyers
University of Notre Dame
natalie.meyers@nd.edu

[Presentation Slides from HICSS-52]

The volume and complexity of available data are increasing at an unprecedented rate. However, the human ability to analyze and comprehend data remains constant. Visual Analytics is the science of analytical reasoning facilitated by interactive visual interfaces. It combines scientific investigation of information processing in human-computer cognitive systems with the design and implementation of interactive visualizations that support this processing, building upon research methods and theories from computer science, management information systems, and the cognitive, perceptual, and social sciences.

The goals of this tutorial are to inform and guide researchers and practitioners in the practice of interdisciplinary visual analytics research and development, and to discuss the far-reaching and practical applications of visual analytics technologies.

Talks will focus on the perceptual, cognitive, and communication theories that can inform VA interaction design and on practical approaches for building, customizing, and deploying visual analytics in organizations. The half-day session introduces VA with an emphasis on existing research methods and on visual analytics at large scale.

Agenda

  • Introduction to Visual Analytics: Current Research and Practices
  • Applications and Issues in Large-scale Visualization and Visual Analytics
  • An interactive panel discussion of the future directions for visual analytics as a field

SWT Leaders:

Kelly Gaither (Primary Contact)
University of Texas at Austin
kelly@tacc.utexas.edu

David Ebert
Purdue University
ebertd@ecn.purdue.edu

A personal AI agent (PAIA) belonging to each of us could maximize the value of our personal data by fully utilizing the data to enhance the quality of our life and work, while disclosing the data to others only in limited cases. The total value thus created could exceed 30% of GDP. The production and operation of such PAIAs would therefore be by far the most profitable business. This PAIA business is likely to prosper within a decade, aggregating each person's data with that person so that our PAIAs can use it to assist, or perhaps control, us. Most of our selection and purchasing decisions would then follow our PAIAs’ decisions because we are all lazy without exception. Governance of PAIAs is hence an essential issue concerning our privacy, human rights, welfare, economy, politics, culture, and so forth.

This symposium will shed light on such risks and benefits of PAIA from diverse viewpoints, including computer science, marketing science, ethics, legal science, and sociology, among others. Below are some topics to be addressed.

  • Technical foundation of PAIA
  • Social acceptability of PAIA and PAIA business
  • Governance of personal data using PAIA
  • Governance of PAIA for both personal and societal values
  • Role of PAIA in Society 5.0

For more information about this symposium, please visit http://www.sict.i.u-tokyo.ac.jp/members/hasida/pai/

SWT Leaders:

Koiti Hasida (Primary Contact)
The University of Tokyo and RIKEN
hasida.koiti@i.u-tokyo.ac.jp

Keiko Toya
Meiji University
i.ktoya@gmail.com

Ayako Kato
Toyo University

Hiroshi Nakagawa
RIKEN
hiroshi.nakagawa@riken.jp

This tutorial will provide participants with a solid introduction to the opportunities and challenges of text mining, using both commercial and open-source tools, and help to situate these approaches within the broader context of big data analytics.

At the end of this tutorial, participants will be able to:

  • discuss the role of text mining in big data analytics;
  • understand the main opportunities and challenges in text analysis;
  • articulate several conceptual approaches to text analytics;
  • explain how to automate the process of collecting textual data;
  • describe commercial and open source software appropriate for text mining.

There will be a hands-on session; please bring a computer and data!
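
For a sense of what a first text mining step can look like, here is a minimal sketch that tokenizes a few documents and counts term frequencies. It uses only the Python standard library and is not one of the tools covered in the tutorial; the stopword list, the sample sentences, and the function names (`tokenize`, `term_frequencies`) are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for"}


def tokenize(text):
    """Lowercase the text, extract alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def term_frequencies(documents):
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


# Hypothetical sample documents.
docs = [
    "Text mining is a part of big data analytics.",
    "Text analysis tools automate the collection of textual data.",
]
for term, count in term_frequencies(docs).most_common(3):
    print(term, count)
```

Real projects typically add steps such as stemming, phrase detection, and weighting schemes on top of raw counts like these.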

SWT Leaders:

Derrick Cogburn (Primary Contact)
American University
dcogburn@american.edu

Normand Peladeau
Provalis Research Corp
peladeau@provalisresearch.com

Michael Hine
Carleton University
mhine@sprott.carleton.ca