Fwd: [software-design-book] A Philosophy of Software Design

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Fwd: [software-design-book] A Philosophy of Software Design

Tim Daly
---------- Forwarded message ----------
From: Brooks Moses <[hidden email]>
Date: Mon, 15 Jul 2019 16:33:29 -0700
Subject: Re: [software-design-book] A Philosophy of Software Design
To: John Ousterhout <[hidden email]>
Cc: Tim Daly <[hidden email]>, software-design-book
<[hidden email]>

Tim,

In my opinion, one of the interesting things about both literate
programming and proving correctness is that explicitly doing them is less
useful today than it was when they were invented, because a lot of the
useful parts have been incorporated into programming languages and "normal"
toolchains.  So here's a counterpoint argument to your points:

I've been a fan of literate programming since I discovered it, to the point
where I wrote my own tool to process the Fortran 95 code from my thesis
work (Mech. Eng., not C.S.) as literate programs.  However, now that I've
been programming professionally for a decade, I don't really miss it.

From what I remember, there were three significant pieces to the "literate
programming" toolchain as Knuth invented it.  The piece that I consider the
major innovation was that Knuth's literate-programming tool allowed one to
organize the code in an order that made sense to humans, rather than what
the programming language he was using (Pascal, IIRC) required.  Second, the
tool allowed one to produce cross-references somewhat automatically.
Third, the text around the program was written in LaTeX rather than plain
text, allowing formatting and mathematics.

The first of these is fundamentally obsolete as a separate tool.  Modern
programming languages and programming style do not require code to be
written in the same computer-centric linear fashion as the Pascal of
Knuth's day required.  We write code in tree structures (directories and
files), which are much easier to navigate than a single linear document.
The files are roughly the same size as the sections that Knuth wrote for
his literate programs, and within them there's a lot more freedom to
organize things in ways that are meaningful to the human readers.  So,
basically, this functionality is just an inherent part of the language and
compiler.

The second of these is also obsolete in its literate-programming form.
Knuth's cross-references were designed for what was fundamentally a
paper-centric style of reading code; they point to numbered sections in a
linear document, with the expectation that one will turn pages to get to
the relevant section.  Today, we have IDEs where we can click on a symbol
and it will automatically pull up a list of places this symbol is
referenced, and with a click it will open the relevant file at the location
of the relevant reference -- and this happens without any need for
annotations on the part of the programmer.

The third of these is the one piece that I think is sometimes still
useful.  The ability to write equations in the comments was pretty critical
for the computational-fluid-dynamics code in my dissertation.  On the other
hand, outside of deeply mathematical regimes, the ability to embed
equations is not a "killer feature".  And the fact that the text around the
program requires a "formatting" step is a problem -- it means that the tool
in which I read the program is not the tool in which I edit the program,
and so I have to have both forms open on my screen and mentally map between
them in order to get any value from the formatted text in understanding the
context around my edits.  The existence of two forms of the document also
makes it hard (at least for me) to have mental landmarks of "where things
are in the file", so I'm having to think about finding things at least
twice as often.  I always found that to be a serious enough drawback that I
pretty much only used the formatted form of the document for rare
only-reading cases.

Beyond the tooling, Knuth's literate programming was also of course a
radically different way of actually writing programs.  It involved writing
large high-level comments explaining the purpose and implementation of the
section of code first, before writing the low-level implementation -- that
is, addressing why the code does what it does in an essay-like way.  For
the "why" part, this is pretty much what the modern programming-style
guidelines I'm familiar with say that comments should be.  We have large
comment blocks at the top of files explaining the purpose of the code in
the file, and similar comment blocks at the top of classes and functions.
The header files for the heavily-used APIs I work with tend to be written
similarly, with functions grouped in human-significant ways with generally
more comment than code explaining the intended use.

For the "what" part, this again is something strongly influenced by the
programming languages Knuth was working with.  Pascal of the day was fairly
low-level, and it could be difficult to "see the forests for the trees".
The computer science community has basically considered this to be a bug in
the programming languages, and so there is far less need for this sort of
comment.  The machine code is now human-readable at a reasonably high
level, so disadvantages of having duplicate definitions of things (once in
the code, once in the comments) now much more commonly outweigh the
advantages of having "what this code does" comments.

So, IMO, a large part of the innovations of literate programming have been
incorporated into standard programming languages and tools, and much of the
rest is reasonably well-accepted as the "right way" to write comments, and
so "literate programming" as a distinct thing is no longer especially
relevant.

I'd make much the same argument for proving correctness, although it's a
much weaker argument.  In my experience, proving correctness has always had
the issue that we have to first define the correct behavior -- and that
definition can also be buggy.  There are some aspects of correct behavior
that are easy because they reasonably universal: Programs should not do
out-of-bounds accesses to arrays, they should not leak memory, they should
not have threading deadlocks, call sites and function definitions should
agree on what arguments are being passed, and so on.  However, many bugs in
practice are cases where something occurred that we didn't expect to happen
when we wrote the specification, and so a proof of correctness would fail
to find the bug.  So I find it hard to talk usefully about "proving
correctness" without some information about what sort of correctness we're
talking about.

For the reasonably-universal things, there have been a lot of advances on
this in programming-language design.  A strong type system is,
fundamentally, a compiler-provided proof of certain kinds of correctness.
Languages such as Rust take this much farther; if the compiler cannot prove
the correctness of the program in a number of ways, it is considered an
invalid program and you get a compile error.  (And, because of this, we
have a fair bit of data about the inherent costs of this -- enough to
support many debates and arguments about the tradeoffs!  But it's
absolutely clear that there are some tradeoffs.)  I think what's happening
is that the idea of a "specification that can be proven" is slowly being
subsumed into the programming language itself, rather than being something
separate -- see, for instance, type annotations in Python, and
static_assert in C++, although those are both fairly small steps.

I'm sure there still remain some cases where manual proofs of correctness
are worthwhile, and some cases where one wouldn't want to consider a
machine-checkable specification to be part of the program, but I think
they're both becoming fairly rare.

In general, I don't disagree with the idea that literate programming and
explicitly proving programming correctness are valuable.  But I think
what's fundamentally valuable there is the underlying ideas, and teaching
these using the same methods as one would have used thirty-odd years ago is
much like going from the idea that structured programming is valuable to
teaching RATFOR.

- Brooks




On Mon, Jul 15, 2019 at 10:52 AM John Ousterhout <[hidden email]>
wrote:

> Hi Tim,
>
> Thanks for the comments. Here are a few overall followups.
>
> * You expressed a concern that deep classes might be harder to maintain
> and modify. If this were the case, then deep classes would be a bad idea.
> However, my experience is that deep classes make it *easier* to maintain
> and modify software. Deep classes don't have to be more complicated than
> shallow ones, and well-designed deep code is mostly non-task-specific (see
> Chapter 6 of APOSD).
>
> * You argued for proving correctness. I'm not opposed to this, but as of
> today I believe it is too difficult and time-intensive to be practical in
> most environments. In other words, if you're looking for the lowest-cost
> way to achieve a given level of functionality, reliability, etc., you can
> probably get there more cheaply using the more intuitive design approaches
> I advocate than using formal verification. It feels like verification is
> making gradual progress, so perhaps this situation will change in the
> future. Also, for systems that require extremely high levels of
> reliability, it may be that formal verification is the only way to get
> there today.
>
> * You suggested analyzing existing open source projects rather than
> writing new code. There is certainly value in analyzing existing code, but
> in terms of students learning how to design, it's crucial that they write
> code themselves and get feedback on it. I would draw an analogy with
> writing. Reading great books can be helpful in learning to write, but it is
> nowhere near sufficient: you have to write stuff yourself, make mistakes,
> get feedback, and fix the mistakes.
>
> -John-
>
> On Mon, Jul 15, 2019 at 5:29 AM Tim Daly <[hidden email]> wrote:
>
>> I'm watching your Google talk on youtube.
>>
>> I've been programming for 50 years. I've done free software development
>> since 1996. I've written a LOT of code in over 60 languages. I have
>> co-authored
>> 4 commercial products. So I feel like I need to comment.
>>
>> BORN or LEARN?
>>
>> First, you ask if programming is something you are born with or something
>> you can learn. I think I could teach anyone who can follow a cooking
>> recipe
>> to program. Learning to program is easy.
>>
>> But programming is hard. It requires an inborn talent which is the
>> ability to
>> cope with very high levels of frustration and ambiguity. I have spent a
>> whole
>> week chasing a bug only to find that it is a compiler bug (4 times so
>> far).
>> If you can't handle the ground-pounding frustration of failing software
>> which
>> leads you to scream "WHY DOESN"T THIS WORK" and still continue then
>> you will never be a programmer.
>>
>> I whole-heartedly agree with the value of code reviews for learning.
>>
>> DEEP CLASSES
>>
>> I tend to both agree and disagree with your deep class idea. I agree
>> that it is a useful idea for design. But I think it is a bad idea for
>> coding.
>> Deep classes have a lot of complexity and are extremely specific to
>> the problem to be solved. Unfortunately, most of the lifetime of code
>> is post-development. Maintaining deep-class code is nearly impossible
>> because it is so task specific.
>>
>> I prefer deeply layered code. See, for example, Sarker [0] where the
>> development is incremental and deeply layered. But the intellectual
>> steps are small and easily adapted. I find this form of development
>> easier to do, easier to teach, and easier to maintain. The most
>> productive programmers I know write small but working pieces of
>> code that makes incremental steps of improvement. Note that this
>> is NOT a tactical approach with ad hoc decisions, but small steps
>> toward the ultimate goals.
>>
>> I took a course (at UCONN) where the prof gave us the spec of a
>> multitasking operating system. He gave us 10 weeks (in teams) to
>> develop it. It had to run on bare hardware (these days, an Arduino).
>> We had to process 100 "batch programs" to be run in minumum
>> time, assuming they block for I/O, etc. We developed a minimal
>> Read-Schedule-Process-Print loop and then enhanced it bit by bit.
>> You might consider that as an example project.
>>
>> So while I agree that a Deep Class DESIGN is seems like a good
>> short term idea, I think it costs much more in the long term due to
>> the high maintain / modify costs.
>>
>> Designers need to consider the full software lifecycle, not just the
>> initial implementation. You would not like an automobile that was
>> glued together and could not be fixed.
>>
>> LOGIC and PROOF
>>
>> In my later years I have watched the growth of proving
>> programs correct. I think designers need to write specifications
>> that can be proven. There is a HUGE growth in this field. See
>> Guy Steele [3] invited talk.
>>
>> Designers need to be deeply educated in program proof and
>> typing. At UCONN I took a course that just gave us a pile of
>> research papers. We each were assigned 3 papers from the
>> pile. Each paper had to be presented to the class in a 20
>> minute talk. You were graded on your 3 presentations. We
>> not only learned the theory, we learned to read (and write)
>> research literature.
>>
>> FOSS CONTRIBUTION
>>
>> There is an alternative to wasting class time developing new
>> code to review.
>>
>> You could structure a class for designers that took an
>> open source codebase from github for analysis. The whole
>> codebase would be reviewed throughout the semester and
>> each person who led the section could post fixes from the
>> code reviews to the open source site. Not only would they
>> see real design issues, they would learn to participate in
>> open source (and, incidently, how to maintain with source
>> code control).
>>
>> LITERATE PROGRAMMING
>>
>> On another track, I think you should teach literate programming.
>> All it takes is a simple latex macro and a trivial C[1] or Lisp[2]
>> program to extract code from a latex document. Future designers
>> out to follow the wisdom of Knuth and learn to write for humans
>> and, incidently, for computers.
>>
>> Tim Daly
>> http://daly.axiom-developer.org
>> Carnegie Mellon University
>>
>> [0] Sarker, Dipanwita and Waddell, Oscar and Dybvig, R. Kent
>> "A Nonpass Infrastructure for Compiler Education" (2004)
>> 9th ACM SIGPLAN, pp 201-212
>>
>> [1] http://daly.axiom-developer.org/tanglec.c
>>
>> [2] http://daly.axiom-developer.org/tangle.lisp
>>
>> [3] https://www.youtube.com/watch?v=dCuZkaaou0Q
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "software-design-book" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [hidden email].
>> To post to this group, send email to
>> [hidden email].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/software-design-book/CAJn5L%3DKzPb4vHKfJFCfJjmNbtSjEJmnmFNUHB_JLx2iPKGxgUQ%40mail.gmail.com
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "software-design-book" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> To post to this group, send email to [hidden email]
> .
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/software-design-book/CAGXJAmzxj8wsVk9Ei9HQopWiLZCA7HXEL%2Bijxb7JnXfV4AMmqA%40mail.gmail.com
> <https://groups.google.com/d/msgid/software-design-book/CAGXJAmzxj8wsVk9Ei9HQopWiLZCA7HXEL%2Bijxb7JnXfV4AMmqA%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

_______________________________________________
Axiom-developer mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/axiom-developer