Title: The Tensor-Relational Algebra, and Other Ideas in Machine Learning System Design
Abstract: One area in which machine learning systems (e.g., TensorFlow, PyTorch) are wanting is their support for Big Models and Big Data. Modern relational systems scale to large data sizes and multiple machines quite well out of the box; in contrast, getting machine learning computations to work in a distributed setting or with large models is often very challenging. In this talk, I argue that the fundamental problem is a lack of abstraction in these systems, and that it makes sense to re-design them from the ground up, applying many of the lessons from the heyday of relational database system design in the 1970s and 80s.
Bio: Chris Jermaine is a Professor of Computer Science at Rice University, and directs Rice’s Data Science Initiative. He is the recipient of an Alfred P. Sloan Foundation Research Fellowship, a National Science Foundation CAREER award, and the George R. Brown School’s Teaching & Research Excellence Award. He has received best paper/best paper runner-up awards from top journals/conferences in data mining and data management, including IEEE ICDE, ACM SIGMOD, ACM SIGKDD, and VLDB, as well as the IBM Pat Goldberg Award given annually to the best papers published by IBM. He currently serves as the editor-in-chief of ACM Transactions on Database Systems, the ACM’s flagship journal for data management research.
Title: What is special about spatial data science and Geo-AI?
Abstract: The importance of spatial data science and Geo-AI is growing with the rise of spatial and spatiotemporal big data (e.g., trajectories, remote-sensing images, census and geo-social media). Societal use cases include Agriculture (e.g., global crop monitoring, precision agriculture), Location-based services (e.g., navigation, ride-sharing), Public Health (e.g., monitoring disease spread), Environment and Climate (e.g., change detection, land-cover classification), Smart Cities (e.g., mapping buildings), etc.
Classical data science and AI methods (e.g., machine learning) often perform poorly when applied to spatial data sets, for several reasons. First, spatial data is embedded in a continuous space, and classical statistics (e.g., correlation) are not robust to the modifiable areal unit problem. Second, spatial data items have extended footprints (e.g., line strings, polygons) and implicit relationships (e.g., distance, touch). Third, the high cost of spurious patterns requires guardrails (e.g., statistical significance tests) to reduce false positives. Furthermore, spatial autocorrelation and variability violate the classical assumption that data samples are generated independently from identical distributions, which risks producing models that are either inaccurate or inconsistent with the data.
Thus, new methods are needed to analyze spatial data. This talk surveys common and emerging methods for spatial classification and prediction (e.g., spatial autoregression, spatial variability aware neural networks), as well as techniques for discovering interesting, useful and non-trivial patterns such as hotspots (e.g., circular, linear, arbitrary shapes), interactions (e.g., co-locations, cascade, tele-connections), spatial outliers, and their spatio-temporal counterparts.
Bio: Shashi Shekhar, a McKnight Distinguished University Professor at the University of Minnesota and a U.C. Berkeley alumnus, is a leading scholar of spatial computing and Geographic Information Systems (GIS). He serves on the Computing Research Association (CRA) board and as a co-Editor-in-Chief of the Geo-Informatica journal (Springer). Earlier, he served as the President of the University Consortium for GIS (UCGIS) and on many National Academies’ committees. His recognitions include the IEEE-CS Technical Achievement Award, the UCGIS Education Award, IEEE Fellow and AAAS Fellow. His contributions include algorithms for evacuation route planning and spatial pattern (e.g., colocation, linear hotspots) mining, an Encyclopedia of GIS, a Spatial Databases textbook, and a spatial computing book for professionals.