GeoEDF: A Framework for Designing and Executing Geospatial Research Workflows
Date: August 30, 2022 Time: 11:00 AM (Central Time)
Geospatial researchers often spend much of their time wrangling data in their workflows. These data-wrangling processes are typically ad hoc, involving a mix of desktop, interactive computing, and HPC tools, and they require data to be transferred from one platform to another. Besides taking time away from research, these processes become hard to maintain as research group members turn over, ultimately affecting research reproducibility. GeoEDF is designed to address these challenges by enabling researchers to conceive of their workflows as an abstract sequence of data acquisition and processing operations. Reusable workflow building blocks (namely, data connectors and data processors) implement suitably generalized data acquisition and processing operations that can be composed into concrete workflows. The GeoEDF workflow engine plans and executes these workflows in HPC environments, abstracting away the complexities of intermediate data transfer, HPC job scheduling, and parallelization.
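To make the composition idea concrete, here is a minimal sketch in Python of a workflow built from one acquisition step followed by a chain of processing steps. It is an illustration only and does not use GeoEDF's actual workflow syntax or API: the names `run_workflow`, `fetch_tiles`, `reproject`, and `to_shapefile` are hypothetical, and the "data" is just a list of file names standing in for real geospatial files.

```python
from typing import Callable, List

# Hypothetical stand-ins for the two kinds of building blocks.
# A connector acquires data; a processor transforms it.
Connector = Callable[[], List[str]]
Processor = Callable[[List[str]], List[str]]


def run_workflow(connector: Connector, processors: List[Processor]) -> List[str]:
    """Run one acquisition step, then each processing step in order."""
    data = connector()
    for process in processors:
        data = process(data)
    return data


def fetch_tiles() -> List[str]:
    # Assumption: a real connector would download files from a remote source.
    return ["tile_a.tif", "tile_b.tif"]


def reproject(files: List[str]) -> List[str]:
    # Assumption: a real processor would reproject each raster.
    return [name.replace(".tif", "_reprojected.tif") for name in files]


def to_shapefile(files: List[str]) -> List[str]:
    # Assumption: a real processor would convert each file to a shapefile.
    return [name.replace(".tif", ".shp") for name in files]


result = run_workflow(fetch_tiles, [reproject, to_shapefile])
print(result)
```

Because each block only agrees on its input and output, the same `reproject` step could be reused in another workflow with a different connector. In this toy version everything runs in one process; the data transfer, job scheduling, and parallelization handled by the workflow engine are not modeled.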
In this webinar, we will provide an overview of the GeoEDF project and its implementation, GeoEDF workflow syntax and semantics, and the cyberinfrastructure platforms that can currently be used to develop and execute GeoEDF workflows. We will demonstrate data connectors and processors through examples from I-GUIDE convergence science use cases and discuss GeoEDF integration pathways with other cyberinfrastructure platforms.
Rajesh is a Research Scientist in the Scientific Solutions Group at the Rosen Center for Advanced Computing (RCAC), Purdue University. He leads and works on several federally funded projects at the intersection of advanced computing, science, and engineering. Rajesh also has over 15 years of experience as a full-stack application developer and has worked on science gateway projects for a variety of domains, including geosciences, cybersecurity, communication, and anthropology. He currently serves as Co-PI on five NSF-funded grants, including Anvil, a Category I capacity HPC system funded by the NSF in 2020 and operated by RCAC. He is the Software Architect for the GeoEDF workflow engine.
Gaurav is a Master’s student in Information Security at Purdue. He is currently in his final semester, working on his thesis in the field of Social Engineering. He enjoys working on CTFs and playing video games, and he is an avid sports fan. Gaurav has developed several GeoEDF data connectors and processors as a Research Assistant in the Scientific Solutions Group.