Fwd: [software-design-book] A Philosophy of Software Design

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Fwd: [software-design-book] A Philosophy of Software Design

Tim Daly
---------- Forwarded message ----------
From: Tim Daly <[hidden email]>
Date: Mon, 15 Jul 2019 23:41:33 -0400
Subject: Re: [software-design-book] A Philosophy of Software Design
To: Brooks Moses <[hidden email]>
Cc: John Ousterhout <[hidden email]>, software-design-book
<[hidden email]>

RE: PROVING PROGRAMS CORRECT

First, I should point out that the "next generation" of students
are being taught to prove programs. I spent 3 years taking
classes at CMU and its all proof-related teaching. I'm now
taking an MIT course and each lecture is nothing but a proof
or two. See [0] and [1] which are from CMU courses this year.

>I'd make much the same argument for proving correctness,
>although it's a much weaker argument.  In my experience,
>proving correctness has always had the issue that we have
>to first define the correct behavior -- and that definition can
>also be buggy.

"Software Designers" SHOULD specify programs in some
sort of machine-checkable way. Tell me what you want and
I will prove my program does that. Show me the machine
checked spec and I will construct a proven program.

I'm not talking about "programmer tools", like bounds checks
and compile-time type checks. Those are programmer-level
tool quality issues. They are useful but irrelevant. Anything
Rust does is irrelevant.



>I'm sure there still remain some cases where manual proofs
>of correctness are worthwhile, and some cases where one
>wouldn't want to consider a machine-checkable specification
>to be part of the program, but I think they're both becoming
>fairly rare.

I think you didn't watch Guy Steele's talk. The number of papers
using logic notation (and logical development) is increasing at
a furious rate. CMU and MIT are teaching this stuff TODAY to
the next generation. Rather than "becoming fairly rare", it is "the
coming wave".

The software DESIGNER needs to tell me what the spec is
that my program needs to fulfill. If you don't know what you want,
you're sure to get something but...

>teaching these using the same methods as one would
>have used thirty-odd years ago is much like going from the idea
>that structured programming is valuable to teaching RATFOR.

They aren't teaching these like structured programming. They
are teaching these with formal methods. The classes are
teaching the "Calculus of Inductive Construction" (CIC) which
your compiler can't check. These are not "special classes" but
are spread across every class they teach.

[0] Look at "Programming in Martin-Lof's Type Theory"
http://www.cse.chalmers.se/research/group/logic/book/book.pdf

or, to see a "non-programming course", such as self-driving
car software, see:

[1] "Logical Foundations of Cyber-Physical Systems"
https://www.springer.com/gp/book/9783319635873

This was published late in 2018 and is the "state of the art" in
things like robots. If you can't read the proof on page 236
"Sequent Calculus proof shape for a bouncing ball" then you
are not "keeping your saw sharp" and the kids won't hire you.

Is your login software secure? Prove it.
Is your websocket recoverable? Prove it.
Is your JSON reader correct? Prove it.
Is your phone banking app robust? Prove it.
Is your home camera private? Prove it.

If someone calls themselves a software DESIGNER then
I want a DESIGN that specifies what my program must do.
And I want a DESIGN that I can prove my program fulfills.

The days of "mm, yeah, we sorta need a banking thing
in the browser" are dying.

Tim


On 7/15/19, Tim Daly <[hidden email]> wrote:

> RE: LITERATE PROGRAMMING
>
> CODE ORGANIZATION AND IDEs
>
> "The first of these is fundamentally obsolete as a separate tool"...
>
> I wrote code before directories were invented. Code had to be
> broken in less than 4k bytes as the operating system and editor
> lived in the other 4k. So code had to be written in what I call
> Piles-of-Sand (POS). Directories came along and people built
> tools (make) to handle POS code. Clever naming of directories
> carry all the "semantics" (e.g. src, doc, lib, test, build, etc)
>
> You now have 16 gig of memory but you program in POS.
> You have "layered" an IDE on the POS but still write in sand.
>
> Can you navigate a code base if your IDE doesn't work?
> Axiom is 1.2 million lines of code, considered "small" these days.
>
> IDEs, like all code, tend to die. I used to work in Visual Basic
> with an IDE from Microsoft. The IDE is dead (as is the VB code).
>
> I used an old IDE from Microsoft that used to generate all of the
> "glue code" to the visual front end (don't remember its name).
> All I had to do was fill in various code blocks. The IDE died,
> as has the code I wrote in it.
>
>
> THE LONG TERM
>
> Why is this a problem? Well, the REAL problem doesn't show
> up until the code is older. Ten years from now your IDE won't
> run on anything. So your "sematic directory names" (the
> "chapters / sections" organization), and your navigation is
> gone (your "index"). The flow of your comments is scattered
> (the "paragraph organization"). Your "automatic comment
> extractor / formatter" is gone.
>
> Your spiffy new IDE has NO idea how to format
> "Forth / Smalltalk / Pascal / Swift / Visual Basic /
> SNOBOL / WATFIV / PL/I / MASM / or ... Rust.
>
> So you're busy writing "dead" code.
>
> Almost everything I've written over the last 50 years in
> over 60 languages no longer runs. But all of my Common
> Lisp code still runs. As does my Lisp/VM code and
> MACLISP code due to macros.
>
> To understand the long term view of your code find a calculus,
> physics, or chemistry text. Cut it up so all you keep are the
> equations on little index cards. Throw the rest away. THAT
> is your code in 10 years. Can you learn calculus from a box
> of randomly arranged index cards containing equations?
>
> Sure, the equations are "right" and "type correct" but the
> shiny-new programming language you're using (Rust?) has
> no standard. Even if it did (e.g. C++) you'll find that there
> are a dozen "standards" (C99, C11, C14, C17, etc), some
> of which are no longer supported (e.g. C99, C11) because
> the libraries have died.
>
> The 10-year result is that your code is dead the day YOU
> stop typing. Look at github / sourceforge / savannah. There
> are thousand of programs that died the day the lead
> developer stopped.
>
> CONCLUSION
>
> LIterate programming is trivial to use. I have use my editor
> (emacs) to write latex with embedded code.
>
> I type 'make' every 10 lines of changes (code or text) to:
> (1) extracts the code
> (2) compile it
> (3) test it
> (4) recreated the book with chapters/sections/paragraphs/
>      index/tables/graphics all hyperlinked.
> (5) displays it as a PDF (usually just updates the already
>      displayed PDF with new changes)
>
> The day *I* stop coding you have a book that explains the
> whole state of my code (the calculus equations) as well as
> the whole state of my organization (the chapters/etc) and
> the whole state of my thinking.
>
> I work in tools (Latex, Emacs, Common Lisp) that have
> survived for decades with the same functionality. They will
> likely survive in the future unchanged. My "literate
> programming" tool chain is trivial (a 172 line C program) to
> extract code and 'make' to build it.
>
> Literate programs are only useful if you want your code
> to live.
>
> I HIGHLY recommend:
>
> Lisp in Small Pieces, a complete lisp compiler / interpreter /
> debugger in literate form.
>
> https://www.amazon.com/Lisp-Small-Pieces-Christian-Queinnec-ebook/dp/B00AKE1U6O/ref=sr_1_1?keywords=lisp+in+small+pieces&qid=1563245147&s=gateway&sr=8-1
>
> Physically Based Rendering, a graphics rendering engine
> in literate form
>
> https://www.amazon.com/Physically-Based-Rendering-Theory-Implementation/dp/0128006455/ref=sr_1_2?crid=1RKWRJMHFO2NX&keywords=physically+based+rendering&qid=1563245225&s=gateway&sprefix=Physically+Based%2Caps%2C187&sr=8-2
>
> Tim
>
> On 7/15/19, Brooks Moses <[hidden email]> wrote:
>> Tim,
>>
>> In my opinion, one of the interesting things about both literate
>> programming and proving correctness is that explicitly doing them is less
>> useful today than it was when they were invented, because a lot of the
>> useful parts have been incorporated into programming languages and
>> "normal"
>> toolchains.  So here's a counterpoint argument to your points:
>>
>> I've been a fan of literate programming since I discovered it, to the
>> point
>> where I wrote my own tool to process the Fortran 95 code from my thesis
>> work (Mech. Eng., not C.S.) as literate programs.  However, now that I've
>> been programming professionally for a decade, I don't really miss it.
>>
>> From what I remember, there were three significant pieces to the
>> "literate
>> programming" toolchain as Knuth invented it.  The piece that I consider
>> the
>> major innovation was that Knuth's literate-programming tool allowed one
>> to
>> organize the code in an order that made sense to humans, rather than what
>> the programming language he was using (Pascal, IIRC) required.  Second,
>> the
>> tool allowed one to produce cross-references somewhat automatically.
>> Third, the text around the program was written in LaTeX rather than plain
>> text, allowing formatting and mathematics.
>>
>> The first of these is fundamentally obsolete as a separate tool.  Modern
>> programming languages and programming style do not require code to be
>> written in the same computer-centric linear fashion as the Pascal of
>> Knuth's day required.  We write code in tree structures (directories and
>> files), which are much easier to navigate than a single linear document.
>> The files are roughly the same size as the sections that Knuth wrote for
>> his literate programs, and within them there's a lot more freedom to
>> organize things in ways that are meaningful to the human readers.  So,
>> basically, this functionality is just an inherent part of the language
>> and
>> compiler.
>>
>> The second of these is also obsolete in its literate-programming form.
>> Knuth's cross-references were designed for what was fundamentally a
>> paper-centric style of reading code; they point to numbered sections in a
>> linear document, with the expectation that one will turn pages to get to
>> the relevant section.  Today, we have IDEs where we can click on a symbol
>> and it will automatically pull up a list of places this symbol is
>> referenced, and with a click it will open the relevant file at the
>> location
>> of the relevant reference -- and this happens without any need for
>> annotations on the part of the programmer.
>>
>> The third of these is the one piece that I think is sometimes still
>> useful.  The ability to write equations in the comments was pretty
>> critical
>> for the computational-fluid-dynamics code in my dissertation.  On the
>> other
>> hand, outside of deeply mathematical regimes, the ability to embed
>> equations is not a "killer feature".  And the fact that the text around
>> the
>> program requires a "formatting" step is a problem -- it means that the
>> tool
>> in which I read the program is not the tool in which I edit the program,
>> and so I have to have both forms open on my screen and mentally map
>> between
>> them in order to get any value from the formatted text in understanding
>> the
>> context around my edits.  The existence of two forms of the document also
>> makes it hard (at least for me) to have mental landmarks of "where things
>> are in the file", so I'm having to think about finding things at least
>> twice as often.  I always found that to be a serious enough drawback that
>> I
>> pretty much only used the formatted form of the document for rare
>> only-reading cases.
>>
>> Beyond the tooling, Knuth's literate programming was also of course a
>> radically different way of actually writing programs.  It involved
>> writing
>> large high-level comments explaining the purpose and implementation of
>> the
>> section of code first, before writing the low-level implementation --
>> that
>> is, addressing why the code does what it does in an essay-like way.  For
>> the "why" part, this is pretty much what the modern programming-style
>> guidelines I'm familiar with say that comments should be.  We have large
>> comment blocks at the top of files explaining the purpose of the code in
>> the file, and similar comment blocks at the top of classes and functions.
>> The header files for the heavily-used APIs I work with tend to be written
>> similarly, with functions grouped in human-significant ways with
>> generally
>> more comment than code explaining the intended use.
>>
>> For the "what" part, this again is something strongly influenced by the
>> programming languages Knuth was working with.  Pascal of the day was
>> fairly
>> low-level, and it could be difficult to "see the forests for the trees".
>> The computer science community has basically considered this to be a bug
>> in
>> the programming languages, and so there is far less need for this sort of
>> comment.  The machine code is now human-readable at a reasonably high
>> level, so disadvantages of having duplicate definitions of things (once
>> in
>> the code, once in the comments) now much more commonly outweigh the
>> advantages of having "what this code does" comments.
>>
>> So, IMO, a large part of the innovations of literate programming have
>> been
>> incorporated into standard programming languages and tools, and much of
>> the
>> rest is reasonably well-accepted as the "right way" to write comments,
>> and
>> so "literate programming" as a distinct thing is no longer especially
>> relevant.
>>
>> I'd make much the same argument for proving correctness, although it's a
>> much weaker argument.  In my experience, proving correctness has always
>> had
>> the issue that we have to first define the correct behavior -- and that
>> definition can also be buggy.  There are some aspects of correct behavior
>> that are easy because they reasonably universal: Programs should not do
>> out-of-bounds accesses to arrays, they should not leak memory, they
>> should
>> not have threading deadlocks, call sites and function definitions should
>> agree on what arguments are being passed, and so on.  However, many bugs
>> in
>> practice are cases where something occurred that we didn't expect to
>> happen
>> when we wrote the specification, and so a proof of correctness would fail
>> to find the bug.  So I find it hard to talk usefully about "proving
>> correctness" without some information about what sort of correctness
>> we're
>> talking about.
>>
>> For the reasonably-universal things, there have been a lot of advances on
>> this in programming-language design.  A strong type system is,
>> fundamentally, a compiler-provided proof of certain kinds of correctness.
>> Languages such as Rust take this much farther; if the compiler cannot
>> prove
>> the correctness of the program in a number of ways, it is considered an
>> invalid program and you get a compile error.  (And, because of this, we
>> have a fair bit of data about the inherent costs of this -- enough to
>> support many debates and arguments about the tradeoffs!  But it's
>> absolutely clear that there are some tradeoffs.)  I think what's
>> happening
>> is that the idea of a "specification that can be proven" is slowly being
>> subsumed into the programming language itself, rather than being
>> something
>> separate -- see, for instance, type annotations in Python, and
>> static_assert in C++, although those are both fairly small steps.
>>
>> I'm sure there still remain some cases where manual proofs of correctness
>> are worthwhile, and some cases where one wouldn't want to consider a
>> machine-checkable specification to be part of the program, but I think
>> they're both becoming fairly rare.
>>
>> In general, I don't disagree with the idea that literate programming and
>> explicitly proving programming correctness are valuable.  But I think
>> what's fundamentally valuable there is the underlying ideas, and teaching
>> these using the same methods as one would have used thirty-odd years ago
>> is
>> much like going from the idea that structured programming is valuable to
>> teaching RATFOR.
>>
>> - Brooks
>>
>>
>>
>>
>> On Mon, Jul 15, 2019 at 10:52 AM John Ousterhout <[hidden email]>
>> wrote:
>>
>>> Hi Tim,
>>>
>>> Thanks for the comments. Here are a few overall followups.
>>>
>>> * You expressed a concern that deep classes might be harder to maintain
>>> and modify. If this were the case, then deep classes would be a bad
>>> idea.
>>> However, my experience is that deep classes make it *easier* to maintain
>>> and modify software. Deep classes don't have to be more complicated than
>>> shallow ones, and well-designed deep code is mostly non-task-specific
>>> (see
>>> Chapter 6 of APOSD).
>>>
>>> * You argued for proving correctness. I'm not opposed to this, but as of
>>> today I believe it is too difficult and time-intensive to be practical
>>> in
>>> most environments. In other words, if you're looking for the lowest-cost
>>> way to achieve a given level of functionality, reliability, etc., you
>>> can
>>> probably get there more cheaply using the more intuitive design
>>> approaches
>>> I advocate than using formal verification. It feels like verification is
>>> making gradual progress, so perhaps this situation will change in the
>>> future. Also, for systems that require extremely high levels of
>>> reliability, it may be that formal verification is the only way to get
>>> there today.
>>>
>>> * You suggested analyzing existing open source projects rather than
>>> writing new code. There is certainly value in analyzing existing code,
>>> but
>>> in terms of students learning how to design, it's crucial that they
>>> write
>>> code themselves and get feedback on it. I would draw an analogy with
>>> writing. Reading great books can be helpful in learning to write, but it
>>> is
>>> nowhere near sufficient: you have to write stuff yourself, make
>>> mistakes,
>>> get feedback, and fix the mistakes.
>>>
>>> -John-
>>>
>>> On Mon, Jul 15, 2019 at 5:29 AM Tim Daly <[hidden email]> wrote:
>>>
>>>> I'm watching your Google talk on youtube.
>>>>
>>>> I've been programming for 50 years. I've done free software development
>>>> since 1996. I've written a LOT of code in over 60 languages. I have
>>>> co-authored
>>>> 4 commercial products. So I feel like I need to comment.
>>>>
>>>> BORN or LEARN?
>>>>
>>>> First, you ask if programming is something you are born with or
>>>> something
>>>> you can learn. I think I could teach anyone who can follow a cooking
>>>> recipe
>>>> to program. Learning to program is easy.
>>>>
>>>> But programming is hard. It requires an inborn talent which is the
>>>> ability to
>>>> cope with very high levels of frustration and ambiguity. I have spent a
>>>> whole
>>>> week chasing a bug only to find that it is a compiler bug (4 times so
>>>> far).
>>>> If you can't handle the ground-pounding frustration of failing software
>>>> which
>>>> leads you to scream "WHY DOESN"T THIS WORK" and still continue then
>>>> you will never be a programmer.
>>>>
>>>> I whole-heartedly agree with the value of code reviews for learning.
>>>>
>>>> DEEP CLASSES
>>>>
>>>> I tend to both agree and disagree with your deep class idea. I agree
>>>> that it is a useful idea for design. But I think it is a bad idea for
>>>> coding.
>>>> Deep classes have a lot of complexity and are extremely specific to
>>>> the problem to be solved. Unfortunately, most of the lifetime of code
>>>> is post-development. Maintaining deep-class code is nearly impossible
>>>> because it is so task specific.
>>>>
>>>> I prefer deeply layered code. See, for example, Sarker [0] where the
>>>> development is incremental and deeply layered. But the intellectual
>>>> steps are small and easily adapted. I find this form of development
>>>> easier to do, easier to teach, and easier to maintain. The most
>>>> productive programmers I know write small but working pieces of
>>>> code that makes incremental steps of improvement. Note that this
>>>> is NOT a tactical approach with ad hoc decisions, but small steps
>>>> toward the ultimate goals.
>>>>
>>>> I took a course (at UCONN) where the prof gave us the spec of a
>>>> multitasking operating system. He gave us 10 weeks (in teams) to
>>>> develop it. It had to run on bare hardware (these days, an Arduino).
>>>> We had to process 100 "batch programs" to be run in minumum
>>>> time, assuming they block for I/O, etc. We developed a minimal
>>>> Read-Schedule-Process-Print loop and then enhanced it bit by bit.
>>>> You might consider that as an example project.
>>>>
>>>> So while I agree that a Deep Class DESIGN is seems like a good
>>>> short term idea, I think it costs much more in the long term due to
>>>> the high maintain / modify costs.
>>>>
>>>> Designers need to consider the full software lifecycle, not just the
>>>> initial implementation. You would not like an automobile that was
>>>> glued together and could not be fixed.
>>>>
>>>> LOGIC and PROOF
>>>>
>>>> In my later years I have watched the growth of proving
>>>> programs correct. I think designers need to write specifications
>>>> that can be proven. There is a HUGE growth in this field. See
>>>> Guy Steele [3] invited talk.
>>>>
>>>> Designers need to be deeply educated in program proof and
>>>> typing. At UCONN I took a course that just gave us a pile of
>>>> research papers. We each were assigned 3 papers from the
>>>> pile. Each paper had to be presented to the class in a 20
>>>> minute talk. You were graded on your 3 presentations. We
>>>> not only learned the theory, we learned to read (and write)
>>>> research literature.
>>>>
>>>> FOSS CONTRIBUTION
>>>>
>>>> There is an alternative to wasting class time developing new
>>>> code to review.
>>>>
>>>> You could structure a class for designers that took an
>>>> open source codebase from github for analysis. The whole
>>>> codebase would be reviewed throughout the semester and
>>>> each person who led the section could post fixes from the
>>>> code reviews to the open source site. Not only would they
>>>> see real design issues, they would learn to participate in
>>>> open source (and, incidently, how to maintain with source
>>>> code control).
>>>>
>>>> LITERATE PROGRAMMING
>>>>
>>>> On another track, I think you should teach literate programming.
>>>> All it takes is a simple latex macro and a trivial C[1] or Lisp[2]
>>>> program to extract code from a latex document. Future designers
>>>> out to follow the wisdom of Knuth and learn to write for humans
>>>> and, incidently, for computers.
>>>>
>>>> Tim Daly
>>>> http://daly.axiom-developer.org
>>>> Carnegie Mellon University
>>>>
>>>> [0] Sarker, Dipanwita and Waddell, Oscar and Dybvig, R. Kent
>>>> "A Nonpass Infrastructure for Compiler Education" (2004)
>>>> 9th ACM SIGPLAN, pp 201-212
>>>>
>>>> [1] http://daly.axiom-developer.org/tanglec.c
>>>>
>>>> [2] http://daly.axiom-developer.org/tangle.lisp
>>>>
>>>> [3] https://www.youtube.com/watch?v=dCuZkaaou0Q
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups
>>>> "software-design-book" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an
>>>> email to [hidden email].
>>>> To post to this group, send email to
>>>> [hidden email].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/software-design-book/CAJn5L%3DKzPb4vHKfJFCfJjmNbtSjEJmnmFNUHB_JLx2iPKGxgUQ%40mail.gmail.com
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups
>>> "software-design-book" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an
>>> email to [hidden email].
>>> To post to this group, send email to
>>> [hidden email]
>>> .
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/software-design-book/CAGXJAmzxj8wsVk9Ei9HQopWiLZCA7HXEL%2Bijxb7JnXfV4AMmqA%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/software-design-book/CAGXJAmzxj8wsVk9Ei9HQopWiLZCA7HXEL%2Bijxb7JnXfV4AMmqA%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>

_______________________________________________
Axiom-developer mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/axiom-developer