Dept. of Computer Science
http://hdl.handle.net/2142/10755
Feed updated: Thu, 23 Sep 2021 19:20:57 GMT
http://hdl.handle.net/2142/110344
Trustworthy Program Verification via Proof Generation
Zhengyao Lin; Xiaohong Chen; Minh-Thai Trinh; John Wang; Grigore Rosu
In an ideal language framework, language designers only need to define the formal semantics of their languages; deductive program verifiers and other language tools are automatically generated by the framework. In this paper, we propose a novel approach to establishing the correctness of these autogenerated verifiers via proof generation. Our approach is based on the K language framework and its logical foundation, matching logic. Given a formal language semantics in K, we translate it into a corresponding matching logic theory. We then encode formal verification tasks as reachability formulas in matching logic. The correctness of each verification task is established, on a case-by-case basis, by automatically generating a rigorous, machine-checkable mathematical proof of the associated reachability formula. Experiments with our proof generation prototype on various verification tasks in different programming languages show promising performance and attest to the feasibility of the proposed approach.
Semantic framework; Proof generation; Proof checking; Program verification
Published: Sun, 22 Aug 2021
http://hdl.handle.net/2142/110189
Exploring Design Alternatives for Replicated RAMP Transactions Using Maude
Liang, Lei; Liu, Si
Developing correct, scalable, and fault-tolerant distributed databases is hard and labor-intensive. The increasing complexity of such systems under modern cloud infrastructures, e.g., geo-replicated multi-partitioned datacenters, further limits the number of design alternatives that can be explored in practice. There is therefore a need for the formal analysis of both their qualitative properties, e.g., data consistency, and their quantitative properties, e.g., latency, at an early design stage.
In this paper we use formal modeling and both standard and statistical model checking techniques to explore the design space of replicated RAMP (Read Atomic Multi-Partition) transactions for geo-replicated databases. Specifically, we develop in Maude formal, executable specifications of three replicated RAMP designs, two by the RAMP developers and one by us, and analyze their data consistency properties. We further transform the Maude models into probabilistic rewrite theories for statistical model checking w.r.t. performance properties. Our results (i) are consistent with the conjectures made by the RAMP developers, and (ii) show that our promising new design not only provides all crucial data consistency guarantees but also outperforms the other design alternatives.
Distributed Transaction Systems; Performance Estimation; Data Consistency; Formal Specification; Statistical Model Checking; Maude; Rewriting Logic
Published: Fri, 23 Jul 2021
http://hdl.handle.net/2142/109927
Towards a Trustworthy Semantics-Based Language Framework via Proof Generation
Xiaohong Chen; Zhengyao Lin; Minh-Thai Trinh; Grigore Rosu
We pursue the vision of an ideal language framework, where programming language designers only need to define the formal syntax and semantics of their languages, and all language tools are automatically generated by the framework. Due to the complexity of such a language framework, it is a big challenge to ensure its trustworthiness and to establish the correctness of the autogenerated language tools. In this paper, we propose an innovative approach based on proof generation. The key idea is to generate proof objects as correctness certificates for each individual task that the language tools conduct, on a case-by-case basis, and to use a trustworthy proof checker to check the proof objects. This way, we avoid formally verifying the entire framework, which is practically impossible, and thus can make the language framework both practical and trustworthy. As a first step, we formalize program execution as mathematical proofs and generate their complete proof objects. Experimental results show that the performance of our proof object generation and proof checking is very promising.
Semantic framework; Proof generation; Proof checking
Published: Thu, 01 Apr 2021
http://hdl.handle.net/2142/109298
Hyperproperties in Matching Logic
Tusil, Jan; Chen, Xiaohong; Rosu, Grigore
Matching logic is a uniform logic for specifying and reasoning about programming languages and program properties. Many important logics and formal systems have been shown to be definable in matching logic as logical theories. However, no research has studied how hyperproperties can be treated in matching logic. In this paper, we give the first theoretical result showing that HyperLTL (hyper linear temporal logic), an important temporal logic designed for specifying and reasoning about hyperproperties, can be completely captured by matching logic. Our result demonstrates that matching logic offers a uniform treatment of hyperproperties and supports their model checking problems.
hyperproperties; matching logic
Published: Fri, 01 Jan 2021
http://hdl.handle.net/2142/109255
Technical Report: Decidable Fragments of Matching Logic
Rodrigues, Nishant; Chen, Xiaohong; Rosu, Grigore
Matching logic is a unifying logic aimed at defining programming language semantics and reasoning about various program and language properties. It is a general logic designed with minimalism in mind: with only eight syntactic constructs, matching logic can define many important logical frameworks and languages as its theories. Yet, to our knowledge, no research has been conducted into the decidability of matching logic. In this paper, we begin such an initiative with respect to decidable fragments of matching logic and identify the first non-trivial decidable fragment for the empty theory. Our decision procedure extends a tableau system for the modal \(\mu\)-calculus. We also give an implementation of the proposed decision procedure and show that, with modifications, it can be extended to support theories with certain axioms.
matching logic; modal logic; temporal logic; decidability
Published: Tue, 02 Feb 2021
http://hdl.handle.net/2142/109175
Nvidia fellowship proposal - Sketching: a Cognitively inspired Compositional Theorem Prover that Learns
Brando Miranda
Vision & motivation: Theorem proving has important applications in hardware and software verification. In hardware verification, it has been used for integrated circuit design [7, 8]. For software verification, a major success with theorem provers has been the creation of CompCert, a verified compiler for C [9]. In the past, companies like Intel have made major investments in formal methods to guarantee that the processors they manufacture do not have major floating-point bugs; an important example is the Pentium FDIV bug in 1994 that cost them $500M [11]. As a consequence, theorem proving has been used to verify floating-point firmware [10]. But to use this tool effectively, one requires highly trained human experts in the corresponding theorem prover and its area of application (e.g., using the HOL Light theorem prover for hardware verification). However, learnable automated theorem proving promises to revolutionize hardware and software verification by: 1) increasing the automation of the challenging task of theorem proving, and 2) increasing the adaptability of such methods, thereby widening their use and applicability through learning. With this, I envision that Nvidia can be a leader in this field by providing high-quality verified GPU hardware, safe compilation of CUDA code, verified CUDA kernels, and exemplary verified interfaces of Nvidia GPU code for higher-level languages like Python. With these vast, immediately impactful applications, I can see Nvidia leading the way to the highest-quality products and saving millions of dollars in a fiercely competitive industry with increasing demands from gaming and deep learning applications.
In addition, theorem proving has had a major role in human history through the development of the scientific method, mathematics, and technology. Thus, the last major part of my vision is to further empower humanity with powerful automation tools that can take all those areas even further. Moreover, with a system that is capable of learning, the applications are nearly endless. For example, it could play the crucial role of guaranteeing safe AI. Nvidia can become one of the first to pave the way forward in this exciting field and have a long-lasting impact for a richer, safer future.
meta-learning; agi; machine learning; theorem proving
Published: Fri, 11 Sep 2020
http://hdl.handle.net/2142/109139
An Empirical Study of Meta-Learning: a step towards rigorously understanding meta-learning algorithms
Brando Miranda
It has been recently observed that a good embedding is all we need to solve many few-shot learning benchmarks.
In addition, other work has strongly suggested that MAML mostly works via this same method: by learning a good embedding.
This highlights our lack of understanding of what meta-learning algorithms are doing and when they work.
In this work we provide preliminary results that shed some light towards understanding meta-learning algorithms better.
In particular, we identify three interesting properties:
1) it is possible to define a synthetic task that results in a higher degree of meta-adaptation, suggesting that current few-shot learning benchmarks might not have the properties needed for the success of meta-learning algorithms;
2) meta-overfitting occurs when the number of classes (or concepts) is finite, and this issue disappears once the task has an unbounded number of concepts;
3) more adaptation for MAML does not necessarily result in representations that have adapted more, or even perform better.
Finally, we suggest that to understand meta-learning algorithms better, it is imperative that we go beyond tracking only absolute performance and, in addition, formally quantify the degree of meta-learning and track both metrics together.
Reporting results this way in future work should help us identify the sources of meta-overfitting more accurately and, hopefully, design more flexible meta-learning algorithms.
In the appendix we also discuss that quantifying AI safety is important too, but we leave it as future work.
meta-learning; ai; agi; artificial general intelligence; learning to learn; few shot learning; machine learning
Published: Wed, 23 Dec 2020
http://hdl.handle.net/2142/109135
Evaluating sublinear estimators for big data
Brando Miranda
Increasingly, databases are storing more and more data, making it costly to scan every record. However, users are still interested in querying a database to gain some understanding of the data that they have. In this paper we propose three different sampling-based methods to estimate the total value of one particular attribute over a particular group of records in a data set. We first approximate the number of elements belonging to the group and then estimate their mean value. With these two approximated quantities, we can estimate the total amount the group contributes by multiplying the estimated group size by the estimated mean. We also argue the correctness of the algorithms that we propose, and we evaluate each algorithm in practice by comparing them on real data.
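The count-times-mean estimator described above can be sketched in a few lines of Python. This is a minimal illustration, not the report's actual code: the function name `estimate_group_total`, the dictionary-based record format, and the `sample_size` parameter are all assumptions made for the example.

```python
import random

def estimate_group_total(records, group_key, value_key, target_group,
                         sample_size=1000, seed=0):
    """Sampling-based sketch: estimate the total of `value_key` over
    records in `target_group` as (estimated group count) * (estimated mean),
    inspecting only `sample_size` records instead of the whole data set."""
    rng = random.Random(seed)
    n = len(records)
    # Sample uniformly with replacement rather than scanning all records.
    sample = [records[rng.randrange(n)] for _ in range(sample_size)]
    in_group = [r for r in sample if r[group_key] == target_group]
    if not in_group:
        return 0.0
    # Scale the observed group fraction up to an approximate group size.
    est_count = n * len(in_group) / sample_size
    # Estimate the group mean from the sampled group members.
    est_mean = sum(r[value_key] for r in in_group) / len(in_group)
    return est_count * est_mean
```

On a data set of 10,000 records where half belong to group "a" with value 10.0, the true total is 50,000, and the estimate lands within a few percent of it while touching only 1,000 records.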
big data; sublinear algorithms; databases
Published: Fri, 16 May 2014
http://hdl.handle.net/2142/109134
Sketching: a Cognitively inspired Compositional Theorem Prover that Learns to Prove - a Proposal
Brando Miranda
Mathematics is a powerful tool that has benefited humanity for millennia. Proficiency in this subject has far-reaching implications for society because physics, mechanical engineering, computer science, statistics, machine learning, and the scientific method itself depend on it. For example, being able to automate mathematical proofs would allow us to create safe and interpretable A.I., because a system with provable guarantees is a system we understand and can trust. This is why, in this proposal, we suggest a novel way of implementing the concept of sketching for automated theorem provers using powerful ideas from cognitive science and artificial neural networks; these ideas are: compositionality, learning-to-learn, learning as model building, and curriculum learning.
meta-learning; machine learning; sketching; theorem proving; agi; ai safety; AI; artificial general intelligence; automatic theorem proving
Published: Wed, 23 Dec 2020
http://hdl.handle.net/2142/109133
Establishing the foundations of Meta-learning - a Proposal
Brando Miranda
Artificial General Intelligence (AGI) has the potential to be one of the most transformative technologies we are yet to develop. It can affect us in every way, from doing daily housework to advanced theorem proving. Meta-learning, also known as “learning to learn”, is the subfield of machine learning that studies the design of intelligent agents that learn to adapt to novel situations rapidly. In essence, meta-learning should be driving the progress towards AGI, since the ability to learn general skills rapidly is at the core of general intelligence. Although deep learning has deservedly been at the forefront of the current AI revolution, its extensions to the field of meta-learning have lagged behind in delivering such AGI technologies. Instead, its explosion in popularity has delivered a vast variety of methods of increasing complexity that were recently shown [1, 2, 3] to be easily matched by algorithms that arguably exhibit no explicit meta-learning (e.g., pre-training a highly parameterized multi-layer neural network and then fine-tuning the final layer at test time [2]). I believe that to overcome this challenge we have to: 1) design universal metrics and benchmarks that quantify the core goal of meta-learning; 2) develop a scientific framework to understand meta-learning as novel skill acquisition; and 3) incorporate AI safety metrics into the global research development cycle of meta-learning (arguably the appropriate and important place to include such a metric in an actionable way as soon as possible).
meta-learning; agi; general intelligence; AI safety; machine learning; learning to learn; ai
Published: Wed, 23 Dec 2020