OSDI 2021 Accepted Papers

In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. Our evaluation shows that NrOS scales to 96 cores with performance that nearly always dominates Linux at scale, in some cases by orders of magnitude, while retaining much of the simplicity of a sequential kernel. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution while privacy budget cannot. J.P. Morgan AI Research partners with applied data analytics teams across the firm as well as with leading academic institutions globally. Conference Dates: Apr 12, 2021 - Apr 14, 2021. Swapnil Gandhi and Anand Padmanabha Iyer, Microsoft Research. First, Fluffy mutates and executes multi-transaction test cases to find consensus bugs which cannot be found using existing fuzzers for Ethereum. The overhead of GPT is 5% for memory-intensive workloads (e.g., Redis) and negligible for CPU-intensive workloads (e.g., RV8 and Coremarks). Despite their extensive use for debugging and vulnerability discovery, sanitizer checks often induce a high runtime cost. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, vendors and teachers of operating system technology. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. HotNets provides a venue for discussing innovative ideas and for debating future research agendas in networking. If you have any questions about conflicts, please contact the program co-chairs. She is the author of the textbook Interconnections (about network layers 2 and 3) and coauthor of Network Security. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive. 
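The inductiveness check described in the last sentence can be illustrated on a toy example. The sketch below is not DistAI's implementation: it replaces the SMT solver with brute-force enumeration over a tiny finite state space, and the lock-passing protocol, the state encoding, and every function name are hypothetical, chosen only for illustration. A formula is inductive if it holds in every initial state and is preserved by every transition.

```python
from itertools import product

# Hypothetical toy protocol: a single lock moves between two nodes.
# State = (node0_holds, node1_holds, lock_in_transit), all booleans.
ALL_STATES = list(product([False, True], repeat=3))

def initial_states():
    # Initially node 0 holds the lock and no message is in flight.
    return [(True, False, False)]

def successors(state):
    """Return every state reachable from `state` in one protocol step."""
    h0, h1, msg = state
    nxt = []
    if h0:
        nxt.append((False, h1, True))    # node 0 sends the lock
    if h1:
        nxt.append((h0, False, True))    # node 1 sends the lock
    if msg:
        nxt.append((True, h1, False))    # node 0 receives the lock
        nxt.append((h0, True, False))    # node 1 receives the lock
    return nxt

def safety(state):
    # Desired safety property: the two nodes never hold the lock together.
    return not (state[0] and state[1])

def invariant(state):
    # Candidate invariant: the lock is in exactly one place.
    return sum(state) == 1

def is_inductive(formula):
    """Check initiation and consecution by enumerating all states."""
    holds_initially = all(formula(s) for s in initial_states())
    preserved = all(
        formula(t)
        for s in ALL_STATES if formula(s)
        for t in successors(s)
    )
    return holds_initially and preserved

def conjunction(state):
    return invariant(state) and safety(state)

print("safety alone inductive:", is_inductive(safety))
print("invariant and safety inductive:", is_inductive(conjunction))
```

In this toy system the safety property alone is not inductive: a state where node 0 holds the lock while a second copy is in transit satisfies safety but can step to a violation. Conjoining the candidate invariant rules that state out, which is why the conjunction is checked rather than the safety property alone.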
The NAL eliminates remote PM accesses to hot items without inducing extra local PM accesses. Papers accompanied by nondisclosure agreement forms will not be considered. Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer . Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. However, your OSDI submission must use an anonymized name for your project or system that differs from any used in such contexts. For example, talks may be shorter than in prior years, or some parts of the conference may be multi-tracked. Qing Wang, Youyou Lu, Junru Li, and Jiwu Shu, Tsinghua University. We implemented the ZNS+ SSD on an SSD emulator and a real SSD. Registering abstracts a week before paper submission is an essential part of the paper-reviewing process, as PC members use this time to identify which papers they are qualified to review. Pages should be numbered, and figures and tables should be legible in black and white, without requiring magnification. Poor data locality hurts an application's performance. Leveraging this information, Pollux dynamically (re-)assigns resources to improve cluster-wide goodput, while respecting fairness and continually optimizing each DL job to better utilize those resources. Welcome to the 2021 USENIX Annual Technical Conference (ATC '21) submissions site! Camera-ready submission (all accepted papers): 15 March 2022. The hybrid segment recycling chooses a proper block reclaiming policy between segment compaction and threaded logging based on their costs. See the USENIX Conference Submissions Policy for details. Jaehyun Hwang and Midhul Vuppalapati, Cornell University; Simon Peter, UT Austin; Rachit Agarwal, Cornell University. This talk will discuss several examples with very different solutions. Authors are required to register abstracts by 3:00 p.m. 
PST on December 3, 2020, and to submit full papers by 3:00 p.m. PST on December 10, 2020. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings. In 2023 I started another two-year term on the . This paper addresses a key missing piece in the current ecosystem of decentralized services and blockchain apps: the lack of decentralized, verifiable, and private search. The co-chairs may then share that paper with the workshops organizers and discuss it with them. (Jan 2019) Our REPT paper won a best paper at OSDI'18 (Oct 2018) I will serve in the SOSP'19 PC. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl. (Oct 2018) Awarded an Intel Faculty Grant for Research on automated performance optimization (Sep. 2018) Our paper on Foreshadow is accepted to appear at USENIX Security. We focus on NVMe storage devices and show that it is natural to express these semantics in the kernel and the application and only requires a modest two-bit change to the device interface. Uniquely, Dorylus can take advantage of serverless computing to increase scalability at a low cost. As a result, the design of a file system with respect to space management and crash consistency is simplified, requiring only 10.8K LOC for full functionality. Papers must be in PDF format and must be submitted via the submission form. 
USENIX, like other scientific and technical conferences and journals, prohibits these practices and may, on the recommendation of a program chair, take action against authors who have committed them. We also verified a simple NFS server using GoJournal's specs, which confirms that they are helpful for application verification: a significant part of the proof doesn't have to consider concurrency and crashes. DMon speeds up PostgreSQL, one of the most popular database systems, by 6.64% on average (up to 17.48%). Concretely, Dorylus is 1.22× faster and 4.83× cheaper than GPU servers for massive sparse graphs. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. Although the number of submissions is lower than in the past, it's likely only due to the late announcement; being in my first OSDI PC, I think the quality of the submitted and accepted papers remains as high as ever. NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. CLP's gains come from using a tuned, domain-specific compression and search algorithm that exploits the significant amount of repetition in text logs. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. Authors of each accepted paper must ensure that at least one author registers for the conference, and that their paper is presented in-person at the conference. Paper Submission Information: All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. Just using Lambdas on top of CPU servers offers up to 2.75× more performance-per-dollar than training only with CPU servers. 
The experimental results show that Penglai can support 1,000s of enclave instances running concurrently and scale up to 512GB secure memory with both encryption and integrity protection. Timothy Roscoe is a Full Professor in the Systems Group of the Computer Science Department at ETH Zurich, where he works on operating systems, networks, and distributed systems, and is currently head of department. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding, University of California, Santa Barbara. OSDI is "a premier forum for discussing the design, implementation, and implications of systems software." A total of six research papers from the department were accepted to the . Graph Neural Networks (GNNs) have gained significant attention in the recent past, and become one of the fastest growing subareas in deep learning. DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols. Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh. To evaluate the security guarantees of Storm, we build a formally verified reference implementation using the Labeled IO (LIO) IFC framework. Four months after we reported the bugs to Geth developers, one of the bugs was triggered on the mainnet, and caused nodes using a stale version of Geth to hard fork the Ethereum blockchain. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. Petuum Awarded OSDI 2021 Best Paper for Goodput-Optimized Deep Learning Research Petuum CASL research and engineering team's Pollux technical paper on adaptive scheduling for optimized. 
Despite having the same end goals as traditional ML, FL executions differ significantly in scale, spanning thousands to millions of participating devices. Main conference program: 5-8 April 2022. We present NrOS, a new OS kernel with a safer approach to synchronization that runs many POSIX programs. We discuss the design and implementation of TEMERAIRE including strategies for hugepage-aware memory layouts to maximize hugepage coverage and to minimize fragmentation overheads. This is especially true for DPF over Rényi DP, a highly composable form of DP. Moreover, as of October 2020, a review of the 50 most cited empirical papers that list personality as a keyword indicates that all 50 papers were authored by people with institutional affiliations in the United States, Canada, Germany, the UK, and New Zealand, and only three papers included samples outside of these regions (see Supplementary Conference site 49 papers accepted out of 251 submitted. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. In this paper, we show how to address this inefficiency without requiring pages to be rewritten or browsers to be modified. Contact your program co-chairs, [email protected], or the USENIX office, [email protected]. While compiler-based techniques have been proposed to improve data locality, they depend on heuristics, which can sometimes hurt performance. She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. In addition, CLP outperforms Elasticsearch and Splunk Enterprise's log ingestion performance by over 13x, and we show CLP scales to petabytes of logs. 
All submissions will be treated as confidential prior to publication on the USENIX OSDI 21 website; rejected submissions will be permanently treated as confidential. We present DistAI, a data-driven automated system for learning inductive invariants for distributed protocols. Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. She also has made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDOS prevention techniques. Fan Lai, Xiangfeng Zhu, Harsha V. Madhyastha, and Mosharaf Chowdhury, University of Michigan. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. If you are uncertain about how to anonymize your submission, please contact the program co-chairs, [email protected], well in advance of the submission deadline. 
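The goodput metric mentioned above for Pollux can be made concrete with a small sketch. The text only says that goodput combines system throughput with statistical efficiency; the sketch assumes the combination is a simple product. The greedy allocator below is a stand-in chosen for brevity, not Pollux's scheduling algorithm, and the job names and per-GPU numbers are invented for illustration.

```python
def goodput(throughput, efficiency):
    # Assumption: goodput is throughput scaled by statistical efficiency.
    return throughput * efficiency

def greedy_allocate(jobs, total_gpus):
    """Hand out GPUs one at a time to the job with the largest goodput gain.

    `jobs` maps a job name to (throughput, efficiency), two lists indexed
    by GPU count. This is a hypothetical model, not measured data.
    """
    alloc = {name: 0 for name in jobs}

    def job_goodput(name, gpus):
        tput, eff = jobs[name]
        return goodput(tput[gpus], eff[gpus])

    for _ in range(total_gpus):
        best, best_gain = None, 0.0
        for name in jobs:
            n = alloc[name]
            if n + 1 >= len(jobs[name][0]):
                continue  # no model for more GPUs on this job
            gain = job_goodput(name, n + 1) - job_goodput(name, n)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # no job benefits from another GPU
        alloc[best] += 1
    return alloc

# Invented example: job A scales throughput well but loses efficiency,
# job B scales a little slower but keeps its efficiency.
jobs = {
    "A": ([0, 100, 190, 270], [0.0, 1.0, 0.9, 0.8]),
    "B": ([0, 80, 160, 240], [0.0, 1.0, 1.0, 0.95]),
}
allocation = greedy_allocate(jobs, total_gpus=3)
print(allocation)
```

With these numbers, raw throughput alone would favour job A for the second GPU (a gain of 90 against 80), but its falling efficiency makes the goodput gain smaller than job B's, so the extra GPUs go to B.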
We prove that DistAI is guaranteed to find the ∃-free inductive invariant that proves the desired safety properties in finite time, if one exists. She also invented the spanning tree algorithm, which transformed Ethernet from a technology that supported a few hundred nodes, to something that can support large networks. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14-16, 2021. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. OSDI brings together professionals from academic and industrial backgrounds in a premier forum for discussing the design, implementation, and implications of systems software. See the Preview Session page for an overview of the topics covered in the program. Authors may upload supplementary material in files separate from their submissions. We present case studies and end-to-end applications that show how Storm lets developers specify diverse policies while centralizing the trusted code to under 1% of the application, and statically enforces security with modest type annotation overhead, and no run-time cost. These results outperform state-of-the-art HTAP systems by several orders of magnitude on transactional performance, while just incurring little performance slowdown (5% over pure OLTP workloads) and still enjoying data freshness for analytical queries (less than 20 ms of maximum delay) in the failure-free case. However, Addra improves message latency in this architecture, which is a key performance metric for voice calls. 
Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. At a high level, Addra follows a template in which callers and callees deposit and retrieve messages from private mailboxes hosted at an untrusted server. Extensive experiments show that GNNAdvisor outperforms the state-of-the-art GNN computing frameworks, such as Deep Graph Library (3.02× faster on average) and NeuGraph (up to 4.10× faster), on mainstream GNN architectures across various datasets. Ankit Bhardwaj and Chinmay Kulkarni, University of Utah; Reto Achermann, University of British Columbia; Irina Calciu, VMware Research; Sanidhya Kashyap, EPFL; Ryan Stutsman, University of Utah; Amy Tai and Gerd Zellweger, VMware Research. Paper abstracts and proceedings front matter are available to everyone now. This approach misses possible optimization opportunities as transformations that only preserve equivalence on subsets of the output tensors are excluded. Authors are also encouraged to contact the program co-chairs, [email protected], if needed to relate their OSDI submissions to relevant submissions of their own that are simultaneously under review or awaiting publication at other venues. A hardware-accelerated thread scheduler makes sub-nanosecond decisions, leading to high CPU utilization and low tail response time for RPCs. Secure hardware enclaves have been widely used for protecting security-critical applications in the cloud. In contrast, CLP achieves a significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable or even better than Elasticsearch and Splunk Enterprise. Manuela will present examples and discuss the scope of AI in her research in the finance domain. 
Fortunately, we observe that the backups for high availability in modern distributed OLTP systems can be retrofitted to bridge the analytical queries and transactions in HTAP workloads. Responses should be limited to clarifying the submitted work. Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. For conference information, . Authors may submit a response to those reviews until Friday, March 5, 2021. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. (Visa applications can take at least 30 working days to process.) GoJournal is implemented in Go, and Perennial is implemented in the Coq proof assistant. We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. Writing a correct operating system kernel is notoriously hard. However, existing enclave designs fail to meet the requirements of scalability demanded by new scenarios like serverless computing, mainly due to the limitations in their secure memory protection mechanisms, including static allocation, restricted capacity and high-cost initialization. Commonly used log archival and compression tools like Gzip provide a high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. Amy Tai, VMware Research; Igor Smolyar, Technion Israel Institute of Technology; Michael Wei, VMware Research; Dan Tsafrir, Technion Israel Institute of Technology and VMware Research. 
We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. Pollux improves scheduling performance in deep learning (DL) clusters by adaptively co-optimizing inter-dependent factors both at the per-job level and at the cluster-wide level. Many application domains can benefit from hybrid transaction/analytical processing (HTAP) by executing queries on real-time datasets produced by concurrent transactions.