The scientific challenge for this project is to accelerate discovery and exploration of the synthetic biology design space. Many parts used in synthetic biology come from, or are first tested in, a simple bacterium, E. coli, but many potential applications in energy, agriculture, materials, and health require either different bacteria or higher-level organisms (yeast, for example). Currently, researchers rely on trial and error because they cannot find reliable information about prior experiments with a given part of interest, and this process cannot scale. Achieving scale requires harnessing a wide range of data so that the likelihood of success can be estimated with confidence. The quantity of data and the exponential growth in publications generated by this field are creating a tipping point, but this data is not readily accessible to practitioners. To address this challenge, our multidisciplinary team of biological engineers, machine learning experts, data scientists, library scientists, and social scientists will build a knowledge system that integrates disparate data and publication repositories to deliver effective and efficient access to collectively available information, enabling expedited, knowledge-based synthetic biology design research.
SD2E began as the DARPA SD2 program's Environment for enabling advanced scientific modeling and computation. The Synergistic Discovery and Design (SD2) program focuses on developing data-based approaches for accelerating scientific discovery and the design of robust models in new domains of research. More information on SD2 can be found here.
SD2E now serves SD2 and other related data-driven scientific programs. SD2E consists of this web portal; a web-based research workbench; RESTful APIs (Tapis) and function-as-a-service (Abaco) that link computational applications and workflows with command-line access and control via the web portal; advanced high-performance computational and data storage hardware; and the skilled personnel who support scientists in their use of the computational resources. Additional tools incorporated into SD2E include JupyterHub, GitLab, Jenkins, Redash, and SynBioHub. Access control to data and software is maintained at multiple levels, allowing private, user-only access during initial testing, followed by project-level shared access, and finally publishing capabilities.
SD2E is managed by the Texas Advanced Computing Center (TACC), where many of the world’s most powerful research computing resources are designed and operated. More information on TACC resources can be found here.
This research aims to advance probabilistic verification techniques for the rigorous design of dependable systems in synthetic biology and nanotechnology. The major goals of the project are as follows. First, scale up stochastic model checking with efficient and accurate state space truncation techniques. Second, investigate practical stochastic counterexample generation techniques and use them to improve the accuracy of the state reductions. Third, derive automated guidance mechanisms, learned from stochastic counterexamples, to improve the quality and efficiency of rare-event stochastic simulations. Finally, integrate the proposed framework into existing state-of-the-art stochastic model checking tools, PRISM and STORM, and evaluate the methodology on a wide range of case studies derived from synthetic biology and nanotechnology applications. This is the first time these methods have been combined into a single methodology. Altogether, this research will improve the accuracy of analysis of infinite-state stochastic systems with rare-event properties.
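To give a feel for what state space truncation means for an infinite-state stochastic model, here is a minimal Python sketch. It is an illustration under stated assumptions and not the project's implementation, nor the API of PRISM or STORM. The toy model (a molecule produced at rate k and degraded at rate g per molecule), the function names truncated_transient and bounds_at_most, and all parameter values are hypothetical. States above a cutoff n_max are lumped into a single absorbing sink state; the probability mass that ends up in the sink bounds the error introduced by the truncation.

```python
import math


def truncated_transient(k, g, n_max, t, eps=1e-12, max_terms=100_000):
    """Transient distribution at time t of a truncated birth-death CTMC.

    Hypothetical toy model: production at rate k, degradation at rate g * x.
    States 0..n_max are kept; index n_max + 1 is an absorbing sink that
    stands in for every state beyond the cutoff. The chain starts in state 0.
    The distribution is computed by uniformization.
    """
    sink = n_max + 1
    size = n_max + 2
    q = k + g * n_max              # uniformization rate (max exit rate)
    v = [0.0] * size               # distribution of the uniformized DTMC
    v[0] = 1.0
    result = [0.0] * size
    weight = math.exp(-q * t)      # Poisson(q * t) probability of 0 jumps
    acc, j = 0.0, 0
    while acc < 1.0 - eps and j < max_terms:
        for i in range(size):
            result[i] += weight * v[i]
        acc += weight
        nxt = [0.0] * size
        for i in range(n_max + 1):
            up = k / q
            down = g * i / q
            nxt[i + 1] += v[i] * up            # i + 1 is the sink at the cutoff
            if i > 0:
                nxt[i - 1] += v[i] * down
            nxt[i] += v[i] * (1.0 - up - down)
        nxt[sink] += v[sink]                   # the sink is absorbing
        v = nxt
        j += 1
        weight *= q * t / j                    # next Poisson weight
    return result


def bounds_at_most(m, k, g, n_max, t):
    """Lower and upper bounds on P(X(t) <= m), assuming m <= n_max."""
    p = truncated_transient(k, g, n_max, t)
    lower = sum(p[: m + 1])
    upper = lower + p[n_max + 1]   # add all mass that escaped the truncation
    return lower, upper


if __name__ == "__main__":
    for cutoff in (12, 20, 40):
        lo, hi = bounds_at_most(5, k=10.0, g=1.0, n_max=cutoff, t=2.0)
        print(f"n_max={cutoff}: {lo:.6f} <= P(X(2) <= 5) <= {hi:.6f}")
```

Raising the cutoff tightens the interval at the cost of a larger state space, which is the trade-off that truncation techniques aim to manage. Real models in this setting have many species and far larger state spaces, so this one-dimensional example only shows the bounding idea.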