underlap

I am Glyn Normington, a retired software developer, interested in helping others improve their programming skills.

TSMC is a Taiwanese semiconductor producer. According to a recent article in Wired: 'Every six months, just one of [TSMC's] 13 fabrication plants “carves and etches” a quintillion transistors just for Apple.'

But according to TSMC's own blog in August 2020: “Since [April 2018], we have manufactured 7nm chips for well over 100 products from dozens of customers. It is enough silicon to cover more than 13 Manhattan city blocks, and with more than a billion transistors per chip, this is true Exa level, or over one quintillion 7nm transistors.”

Did Wired really get their figure wrong by at least a factor of 5, or have TSMC stepped up their productivity massively in the last three years? There also seems to be confusion over the timescale, as well as over which customers were involved.

The Wired article also makes this eye-popping claim: “the semiconductor industry churns out more objects in a year than have ever been produced in all the other factories in all the other industries in the history of the world.” This I can believe.

An article on The Verge, Bring Back Personal Blogging¹, makes this provocative statement:

Personal stories on personal blogs are historical documents when you think about it. They are primary sources in the annals of history, and when people look back to see what happened during this time in our lives, do you want The New York Times or Washington Post telling your story, or do you want the story told in your own words?

I have no illusions that my own writing has any historical interest, but others have pointed out that we can't tell what's of historical interest until a couple of hundred years have elapsed.

With that in mind, it seems most blogs are ephemeral. If the content's continued availability depends on some company, that company may well cease to exist. If the content is on my personal site, possibly under a rented domain name, then when I stop paying for the site or the domain name, the content will cease to exist.

I guess some popular blogs may make it into the Internet Archive, for posterity, but they are not likely to include mine.

Reply via email

Postscript

James pointed out that some of my posts are already archived and it's easy to archive more.

Footnote 1: via Start a F***ing Blog, via Does a Blog Need to Integrate

I'm sympathetic to the idea behind Bring Back Blogging, even though I find there's more inertia to writing a blog post than to posting, say, on Mastodon.

But it's tricky to know what to do about hosting.

I could potentially host my own blog, but then I'd have the costs associated with a hosting service and renting a domain name. I'd be responsible for regularly upgrading the operating system and blogging software to avoid security exposures. If I wanted to split the cost of hosting with others, I'd have to provide them with some kind of support. Also, when I eventually stopped hosting, my posts and those of anyone else sharing the service would cease to exist, so any important posts would need to be moved elsewhere first. Finally, if I hosted my own blog, that could be the thin end of the wedge: I'd be tempted to host my own Mastodon instance too, and so on.

The alternative to hosting my own blog is to use a commercial blogging site such as Blogger (which I used regularly over seven years ago), Medium, or WordPress. But I find the commercial aspect of these a little distasteful. Unless I paid to use them, and possibly even if I did, these platforms would exploit my writing by subjecting my readers to advertisements, promotions, or other visual clutter.

For now, I'll stick with wordsmith.social and try to find out who pays for it and whether I can contribute to their costs.

Reply via email

It's easy to tidy up remote-tracking branches whose counterparts have been deleted on a remote by using the prune switch on git fetch, e.g.:

git fetch --all --prune

but deleting the corresponding local branches is less convenient.

Until today, I used git branch to spot local branches and then:

git branch -d branch-name

to delete them, one at a time.

But today I found this solution on Stack Overflow which deletes all local branches that have been merged into main:

git branch --no-contains main --merged main | xargs git branch -d
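
If you find yourself doing both steps regularly, they can be rolled into a git alias along these lines (a sketch, which assumes the default branch is called main):

$ git config --global alias.tidy '!git fetch --all --prune && git branch --no-contains main --merged main | xargs git branch -d'

after which git tidy does both jobs in one go.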

A recent thread on Mastodon got me thinking about my experience with concurrent programming. Here's a thumbnail sketch of the approaches I've tried.

Compare-and-swappery

Some of the first concurrent code I wrote involved managing multiple threads of execution using compare and swap instructions. This was hard work, but I wasn't tempted to try anything particularly complex because it was just too hard to reason about. This was before the days of unit testing¹, so developers were used to spending a lot of time thinking about the correctness of code before attempting to run it.
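
For flavour, here's a minimal sketch in Go (not the low-level code I was writing at the time) of a shared counter updated via compare-and-swap; the retry loop is the part that becomes hard to reason about once the update is anything more complex than adding one:

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    var counter int64
    var wg sync.WaitGroup
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for n := 0; n < 1000; n++ {
                for {
                    old := atomic.LoadInt64(&counter)
                    // If another goroutine has updated counter since we loaded it,
                    // the compare-and-swap fails and we go round again.
                    if atomic.CompareAndSwapInt64(&counter, old, old+1) {
                        break
                    }
                }
            }
        }()
    }
    wg.Wait()
    fmt.Println(counter) // always prints 4000
}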

Model checking

One way of reasoning about concurrent code was to model the behaviour in CSP and then use a model checker like FDR to check various properties. Unfortunately, even relatively simple concurrent code took quite a bit of effort to model in CSP. Also, model checking, even with FDR's amazing “compressions”, tended to take too long unless the state space could be kept manageable. So with this approach I again tended to spend a lot of time thinking, this time about how to structure the CSP model to keep model-checking tractable. The result was I only produced one or two limited CSP models.

I would say the main benefit of CSP modelling is that it makes you aware of the main types of concurrency bugs: deadlock (where all or part of the system seizes up permanently), livelock (where the system gets into some kind of unending, repetitive behaviour), and more general kinds of divergence (e.g. where the system spends its time “chattering” internally without making useful progress).
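
As an aside, deadlock is easy to demonstrate outside CSP. In this little Go sketch, each goroutine waits for the other to send first, so neither ever does, and the Go runtime aborts with “all goroutines are asleep - deadlock!”:

package main

func main() {
    a := make(chan int)
    b := make(chan int)
    done := make(chan struct{})
    go func() { <-a; b <- 1; close(done) }() // waits on a before signalling b
    go func() { <-b; a <- 1 }()              // waits on b before signalling a
    <-done // never reached: each goroutine is waiting for the other
}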

Memory models

Java has various low-level locking mechanisms for managing concurrency. The Java memory model gives a good framework for reasoning about concurrent code in Java. Again the emphasis was on reasoning and it was hard work, but at least there was the sense that it was well founded.

Channels and goroutines

I've used Go a lot and would say goroutines (similar to lightweight threads, sometimes called “green threads”) and channels are deceptively simple. The principle is that you can safely write to, and read from, a channel in distinct goroutines. It's easy to build concurrent systems that work most of the time, although it's hard to be sure they are bug free. But at least you're better off than in a language which only provides low-level mutexes and the like.
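
Here's a minimal sketch of the principle: one goroutine produces values on a channel, the main goroutine consumes them, and there isn't a lock in sight:

package main

import "fmt"

func main() {
    results := make(chan int)
    // Producer goroutine: sends the squares of 1..5, then closes the channel.
    go func() {
        for i := 1; i <= 5; i++ {
            results <- i * i
        }
        close(results)
    }()
    // Consumer: the main goroutine ranges over the channel until it's closed.
    for r := range results {
        fmt.Println(r)
    }
}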

Language support

Rust guarantees safe access to shared data at compile time. The main difficulty is getting used to the constraints imposed by this model and then designing your code appropriately. That said, I haven't written much concurrent code in Rust, so I'll simply defer to the book.

Reply via email

Footnote 1: When I did write the occasional unit test, I also deleted it to avoid having to maintain it!

You can avoid checking in certain files in a git project directory by using a .gitignore file.

Then there are files which you don't want to check in, in any project directory, such as editor/IDE configuration files. Instead of “polluting” per-project .gitignore files with such entries, it's better to set up a global .gitignore file:

$ touch ~/.gitignore
$ git config --global core.excludesfile ~/.gitignore

Here's the contents of mine:

*~
.DS_Store
.idea
*.iml
\#*#
*.hsp
*.sav
*.scpt
/scratch/
.vscode/
coverage.out
.Guardfile
.config.ru

(You can see some of my history there: macOS, IntelliJ, VS Code, Go, etc.)

But notice this entry:

/scratch/

This means that a top-level directory in the project named scratch will be ignored, along with its contents (the leading slash anchors the pattern to the root of the repository).

Having such a git scratch directory turns out to be really handy:

  • The files are visible to your editor/IDE.
  • It's easier to remember where you put such files compared to storing them outside the project directory.
  • You can even nest directories in the scratch directory.
  • If you finish with the project and delete the project directory, the files are cleaned up too.

A recent example is from a Rust project:

$ cargo expand > scratch/generated.rs

This puts the generated code in a file which my editor will recognise as Rust code, and will display with syntax highlighting. I definitely didn't want to check that file in!

Other files which are suitable for the scratch directory are:

  • Hacky tests or fixtures I'm too embarrassed to check in.
  • Dependencies I don't control, but which I need to modify, e.g. for debugging.
  • TODO lists and other rough notes.
  • Output files from static analysis or code coverage.
  • Old versions of code files from the project which I want to refer to quickly.
  • Old project executables for comparing with the current behaviour.
  • Downloaded PDF manuals relating to the project.

I'm sure you'll find many other uses for git scratch directories.

Reply via email

A micro-commit is a commit of a small/incremental code change, which ideally also passes the tests.

Micro-commits combine nicely with TDD as follows:

  1. Write a failing test.
  2. Make all the tests pass.
  3. Commit.
  4. Refactor to make the code clean.
  5. Commit.
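
In terms of the underlying commands, one pass through the cycle might look roughly like this (a sketch: go test stands in for whatever test command your project uses, and the commit messages are made up):

$ go test ./...                            # the new test fails
# ...make the smallest change that passes the test...
$ go test ./...                            # everything is green
$ git commit -am "Handle empty input"
# ...refactor...
$ go test ./...                            # still green
$ git commit -am "Extract a validation helper"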

Why commit so often?

Mark Seemann's recent Stack Overflow blog post “Use Git tactically” uses one of my favourite analogies for coding: rock climbing. The basic idea of micro-commits is to work with small, safe changes and commit after each one, much as rock climbers regularly secure their ropes to the rock face.

If something goes wrong, there's a smaller distance to fall. In coding, something going wrong, such as a regression creeping into the code, can be recovered from by returning to an earlier commit without losing much work in the process.

Other benefits

Using micro-commits reduces the stress of coding. Not only do you know you're not going to lose much work if you mess things up, but by focussing on one small step at a time, you don't have to keep a lot of information in your head.

Micro-commits also help you stay focussed on making just one change at a time. To avoid falling into the “While You’re At It” trap, described in Tim Ottinger's blog “What's this about Micro-commits?”, keep a TODO list, either in an issue or, for short-lived items, on a piece of paper.

Squash or merge?

After completing a fix, feature, or other piece of work, it's important to decide what to do with all the micro-commits.

One approach is to squash them into one or more substantial commits. A downside of squashing is that there is then no record of the sequence of micro-commits, which could come in handy (e.g. for bisecting out a bad change) if some problem has crept in which didn't show up while running the unit tests. An advantage of squashing is that people reading the commit history can then see the wood for the trees.

Another approach is to merge the changes with a commit message that summarises the changes. It's hard to see the wood for the trees in this approach when reading a linear series of commit messages (such as produced by git log), but tools which show the commit structure make it easy to pick out the merge commits.
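
In git terms, the two approaches look roughly like this (a sketch with a made-up branch name):

$ git checkout main
$ git merge --squash fix-widget      # stages the combined change
$ git commit -m "Summarise the whole change here"

or

$ git checkout main
$ git merge --no-ff fix-widget       # keeps the micro-commits behind a merge commit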

Commit messages

Commit messages for micro-commits are likely to be brief, but don't forget to reference any associated issue(s). You may want to treat the commit messages like a discussion thread and drop in the occasional comment on the overall design when it occurs to you rather than forget to include it when you finally squash or merge.

My favourite “morning paper” is “On formalism in specification”. Studying Bertrand Meyer's original paper, trying to avoid its “seven sins of specification”, and reading Strunk and White's “The Elements of Style” improved my technical writing enormously.

The first comment on the morning paper (by a certain David Parnas) raises an interesting question of whether there can be a “truly readable mathematical specification”. Sadly, I believe the answer is “no”, given many developers' difficulties with mathematics.

However, that hasn't stopped me writing mathematical specifications from time to time, often as a way of getting a basic understanding of an area of software before starting development. Two specifications of which I am proud are “Image Registries” (download PDF), which nails down some of the basic terminology surrounding Docker and OCI registries, and “OCI Image Format” (download PDF). The style of interspersing English and mathematics in these specifications might even make them readable by those who find mathematics off-putting.

The so-called “Dirty Pipe” vulnerability, CVE-2022-0847, detailed in Max Kellermann's article, was published on 7 March 2022. I recently upgraded my kernel to the stable version 5.16.11. So is there a new stable version with a fix?

According to a post on the kernel mailing list, the fix is in 9d2231c5d74e (lib/iov_iter: initialize "flags" in new pipe_buffer).

However, the change logs for 5.16.12 and 5.16.13 do not mention 9d2231c5d74e or lib/iov_iter. So has the fix still not made it into a stable kernel?

A Unix Stack Exchange answer to the question “Given a git commit hash, how to find out which kernel release contains it?” helped here. The GitHub page for commit 9d2231c5d74e shows that, at the time of writing, the fix is part of v5.17-rc6 and v5.17-rc7. So it seems the fix isn't yet available in a stable kernel.
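
The underlying technique (in a local clone of the kernel tree) is to ask git which tags contain the commit, something like:

$ git tag --contains 9d2231c5d74e        # lists every tag containing the fix
$ git describe --contains 9d2231c5d74e   # describes the commit relative to a containing tag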

Postscript: 21 March 2022

According to the releases page at kernel.org:

After each mainline kernel is released, it is considered “stable.”

v5.17, containing the fix, was released yesterday, so I upgraded to that.

This morning I followed these instructions to upgrade Ubuntu to a development release of 22.04 and the kernel to 5.16.11. I was hoping that a bug would be fixed, but it turned out not to be.

I was previously on 21.10:

$ lsb_release -a
LSB Version:	core-11.1.0ubuntu3-noarch:printing-11.1.0ubuntu3-noarch:security-11.1.0ubuntu3-noarch
Distributor ID:	Ubuntu
Description:	Ubuntu 21.10
Release:	21.10
Codename:	impish

with kernel 5.15.0-rc7:

$ uname -r
5.15.0-051500rc7-generic

I upgraded to a development branch of Ubuntu 22.04:

$ lsb_release -a
LSB Version:	core-11.1.0ubuntu3-noarch:printing-11.1.0ubuntu3-noarch:security-11.1.0ubuntu3-noarch
Distributor ID:	Ubuntu
Description:	Ubuntu Jammy Jellyfish (development branch)
Release:	22.04
Codename:	jammy

and to a stable kernel version 5.16.11:

$ uname -r
5.16.11-051611-generic

If I find any issues, I'll mention them here.

Notes

The following are some notes on errors and interesting observations made during the upgrade process.

Third party repository error

During the upgrade process, I noticed the following error:

E: The repository 'http://ppa.launchpad.net/gezakovacs/ppa/ubuntu impish Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

This error occurred because I had installed UNetbootin from the above third-party repository, which doesn't support all Ubuntu releases.
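
For future reference, one way to avoid the error would presumably be to remove the PPA before upgrading, e.g.:

$ sudo add-apt-repository --remove ppa:gezakovacs/ppa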

Failure upgrading installed packages

While sudo apt-get upgrade was running, the system rebooted, so I had to re-run the command and then follow some recovery instructions, namely to run:

$ sudo dpkg --configure -a
$ sudo apt-get upgrade
$ sudo apt --fix-broken install

Re-running sudo apt-get upgrade then succeeded.

Upgrading the distribution

Towards the end of sudo apt-get dist-upgrade, I noticed the following output:

update-initramfs: Generating /boot/initrd.img-5.15.0-051500rc7-generic
update-initramfs: Generating /boot/initrd.img-5.13.0-30-generic
update-initramfs: Generating /boot/initrd.img-5.13.0-28-generic

So it seems the distribution upgrade is preserving my current kernel version as well as a couple of previously installed kernel versions.

Later, the following output:

* dkms: running auto installation service for kernel 5.15.0-18-generic

showed the upgrade was installing a later 5.15 kernel.