This evolving document contains our best practices for building medical imaging software. If you do not understand or agree with anything in it, bring it up in Slack so we can improve upon it.
Software engineering requires us to make tradeoffs; there are rarely situations where a best-practice or technology is always the right choice.
- Usually writing unit tests is appropriate, but sometimes it isn’t.
- Sometimes writing detailed commit messages is worthwhile, but not always.
- Sometimes Linux is appropriate, sometimes Windows is.
As engineers, we need to make tradeoffs to maximize the value we deliver to our clients. Making the right tradeoffs is difficult. We usually don’t know the full business context, and our clients usually don’t grasp the subtleties of the technical tradeoffs.
Software has both external and internal qualities. External qualities are visible to users. Internal qualities are only visible to developers. This distinction is often blurry because internal qualities often affect external qualities.
Here is a list of external qualities of a piece of software:
- Cost: How much did it cost to create and maintain?
- Correct: Does the software do what it is supposed to do?
- Usable: Is the software easy to use? There are many tradeoffs even within usability; what makes software easy for new users may conflict with what is easy for power users.
- Performant: Does the software make good use of system resources such as memory, disk space, network, and processor time?
- Reliable: Is the software free of bugs?
- Secure: Does the software keep hackers and unauthorized users out?
- Adaptable: Can the system work in different environments without modification? Usually this means that it is appropriately configurable.
- Robust: Does the system continue to work in the presence of invalid input or stressful environmental conditions?
Here is a list of internal qualities of a piece of software:
- Clarity: Can you easily tell what the code is supposed to do?
- Maintainable: How easy is it to keep the software working over time and to add new features?
- Portable: How easy is it to port the software to other environments (e.g. from Linux to Windows)?
- Reusable: Can we save time and money on other projects by re-using the code?
- Readable: Is the code easy for developers to read and understand?
- Testable: Is it constructed in a way that the pieces can be easily and independently tested?
- Tested: Are there automated tests that document how the code should work and allow us to easily verify that it does?
There are certainly other ways to segment the “quality landscape”. This particular segmentation includes some overlapping categories.
Most failed software projects are not due to technical failures. They are usually due to “requirements failures,” i.e., building the wrong thing.
We believe engineers should be involved with the requirements process. You need to know why you are building what you are building.
Requirements indicate what is needed.
Specifications indicate how the requirements should be met.
It is important to distinguish between requirements and specifications. Clients will often give us specifications. When this happens, avoid prematurely implementing these specifications. Rather, work backward to understand the requirements. Once you understand the requirements, you will often realize that they can be met much more easily and simply using another approach (i.e., a different set of specifications).
As engineers, we often get excited about writing code. When a client or manager tells us what to write, we start imagining how we would implement it, and how much fun it will be. Writing code is more fun than having hard conversations with stakeholders about requirements, but those conversations can save immense amounts of time and effort. In the end, our success as individuals, our success as a company, and our impact on the world will depend on the value we add.
Once you know the requirements for a task, it is usually good to spend some time planning out what you are working on before jumping into the code. Here are some questions that are worth asking:
- How should I split this up into smaller parts?
- How will I test this?
- What parts of this are the most likely to change in the future, and how are they likely to change?
- How will my design accommodate these changes?
It is usually a good idea to write some tests first and to specify the API for the main functions and classes involved.
Setting up a development environment should be made as easy as possible. This is useful when you
- need to jump back into an old project
- need to set up an earlier version of a project to isolate a bug
- are blocked on an issue, and want to work on a second one
- have “it works for me but not for you” bugs
- are pulling upstream changes with new dependencies or data migrations
- need “backup” from other developers for a deadline or vacation.
Standard Script Names
These standard development scripts make it easy for developers to manage their project environments and start being productive quickly:
- buildall: Build a fresh development environment. Should assume that system dependencies are present (e.g. Python, MySQL, Redis). Should NOT install system dependencies or isolation environments (e.g. Python virtual environments). Should check system dependency versions. Should run within 5 minutes, and must run within 30 minutes. Should populate development databases with development data, if necessary.
- build: Build the things that are frequently rebuilt (e.g. TypeScript and SCSS). The buildall script should run this last, if it is present (some projects only need a buildall script).
- run: Start all of the development processes (e.g. the Django development server, Webpack hot module reloading, and the task runner).
- test: Run the core set of tests. Should run within 5 minutes and must run within 10 minutes; move slow tests into testall.
- testall: Run all of the tests, even the slow ones.
- clean: Clean everything that build creates. The cleanall script should run this first, if it is present.
- cleanall: Clean everything that buildall creates.
All of these commands except run should be idempotent.
The buildall script should create a minimal set of useful development data for new projects. This ensures developers don’t need to manually re-create data to test against.
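Since we prefer Python, a buildall script might look like the following sketch. The dependency names, minimum versions, and build commands below are hypothetical examples; a real project would substitute its own.

```python
#!/usr/bin/env python3
"""Hypothetical `buildall` sketch: verify system dependencies, then build.

All tool names and minimum versions here are illustrative, not prescriptive.
"""
import re
import shutil
import subprocess
import sys

# Example system dependencies and minimum versions (hypothetical).
SYSTEM_DEPENDENCIES = {
    "python3": (3, 8),
    "mysql": (8, 0),
}


def parse_version(text):
    """Extract the first dotted version from a string like 'Python 3.10.2'."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if not match:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def version_ok(actual, minimum):
    """True if `actual` is at least `minimum` (compared component-wise)."""
    return actual >= minimum


def check_dependencies():
    """Check that each system dependency is present and new enough."""
    for tool, minimum in SYSTEM_DEPENDENCIES.items():
        if shutil.which(tool) is None:
            sys.exit(f"{tool} is required but was not found on the PATH")
        output = subprocess.run(
            [tool, "--version"], capture_output=True, text=True
        ).stdout
        if not version_ok(parse_version(output), minimum):
            sys.exit(f"{tool} is older than the required version {minimum}")


def main():
    check_dependencies()
    # Install project-level dependencies; do NOT touch system packages.
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
        check=True,
    )
    # Populate the development database, if the project has one
    # (the fixture name is a hypothetical Django example).
    subprocess.run(
        [sys.executable, "manage.py", "loaddata", "development.json"],
        check=True,
    )

# A real script would call main() here when run from the command line.
```

Note that the script fails fast with a clear message when a dependency is missing or too old, which is what makes the “it works for me but not for you” class of bugs cheap to diagnose.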
The primary objective of software architecture is to manage complexity.
There are several forms of documentation that go into a software project, and each form has its own place. Documentation is expensive to write and maintain. It should be kept as short as possible and should only be written if it is useful. It is usually bad to write documentation about libraries or frameworks, since this duplicates effort; instead, link to external resources. Documentation should accomplish things that cannot be accomplished better in the code.
Comments are especially prone to going stale. Avoid them when possible, and prefer cleaning up bad code over writing comments to explain it.
It is usually bad to commit commented-out code.
Developers are usually hesitant to delete comments. Don’t be. If a comment isn’t useful, delete it.
Use this form so that it is easy to grep:
TODO: Something I should address eventually.
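Because the form is consistent, TODOs can be found mechanically, e.g. with grep -rn "TODO:" or with a small script like the following sketch (the restriction to .py files is an arbitrary illustrative choice):

```python
import re
from pathlib import Path

# Matches the standard TODO form and captures the message after it.
TODO_PATTERN = re.compile(r"TODO: (.*)")


def find_todos(root="."):
    """Yield (path, line_number, message) for every TODO under `root`."""
    for path in Path(root).rglob("*.py"):  # extension filter is illustrative
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            match = TODO_PATTERN.search(line)
            if match:
                yield path, number, match.group(1)
```

A nonstandard form like “todo -” or “FIXME later” silently escapes this kind of search, which is why consistency matters.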
The smaller and newer a project is, the less useful commit messages are. There is a balance to be struck.
Even if your commit messages are short, write good commit messages:
- Separate subject from body with a blank line
- Limit the subject line to 50 characters
- Capitalize the subject line
- Do not end the subject line with a period
- Use the imperative mood in the subject line
- Wrap the body at 72 characters
- Use the body to explain what and why vs. how
There can be too many tests, although there are usually too few. Tests serve as live documentation for a codebase and provide protection from many accidentally-introduced bugs.
Programmatic end-to-end tests are a common source of technical debt accumulation; they are expensive to write and age quickly. Written (manually executed) end-to-end tests are a good compromise. Write out a list of test cases in English with clear, specific instructions.
Unit tests should be isolated (and thus order-independent) and as small as possible.
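For example, tests like the following sketch (using Python’s built-in unittest; the function under test is invented for illustration) each build their own input and assert on a pure function, so they can run independently and in any order:

```python
import unittest


def normalize_patient_name(name):
    """Hypothetical helper: collapse whitespace and title-case a name."""
    return " ".join(name.split()).title()


class NormalizePatientNameTest(unittest.TestCase):
    def test_collapses_internal_whitespace(self):
        self.assertEqual(normalize_patient_name("ada   lovelace"), "Ada Lovelace")

    def test_strips_surrounding_whitespace(self):
        self.assertEqual(normalize_patient_name("  grace hopper "), "Grace Hopper")
```

No database, no shared fixtures, no ordering dependencies: if one test fails, it points at exactly one behavior.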
Sometimes, manual testing is more appropriate than automated testing. If something should be manually tested, write out simple step-by-step instructions for the manual test when you write the code.
Frequent small refactors are usually better than large refactors and re-writes.
It is usually best to avoid refactoring working code because you risk introducing bugs and you make the git history more complicated. If you are working in a file for other reasons, then it is good to improve the code. This may involve:
- improving names
- simplifying function calls
- cleaning out dead code
- deleting useless or old comments
- adding more tests
Avoid the urge to throw away and re-write code that you did not write. The older and more widely used code is, the less valuable a rewrite becomes (and the more likely it is that many subtle bugs have already been fixed).
Usually, code should be reviewed by another Innolitics team member before being merged into the main branch.
Good code reviews take a lot of the reviewer’s time, and delays may frustrate the author, but reviews have many upsides:
- Maintain code quality
- Ensure multiple people are familiar with all code
- An opportunity to praise strong craftsmanship
- A chance to learn from each other
Code reviews are often the last step in achieving a milestone or customer deliverable. Reviewers should give code reviews high priority when scheduling their time; ideally, a review will be completed within a few hours of being requested.
Bugs are inevitable, and they are as much the reviewer’s responsibility as the author’s.
When reviewing code, check for:
- Do tests exist?
- Are they testing the right things?
- Are they the right types of tests?
- Does this change affect any external documentation?
- Are appropriate comments updated?
- Are the git commit messages high quality?
- Are release notes needed?
- Is the code easy to read?
- Could it be simplified further?
Typically, code reviews should happen inside Pull Requests so that the discussion is preserved for future reference.
Pointers for reviewers:
- Ask questions instead of requesting changes
- Commend the author when they have well-written code
- Don’t make edits directly, even for simple changes; it’s best to let the author make them.
Pointers for people being reviewed:
- Respond to every suggestion, even if only with a thumbs-up; this lets the reviewer know that you read and considered what they wrote. A thorough review takes a lot of time, and it is discouraging when the author doesn’t appear to have read it.
Whitespace matters. Be consistent.
Lines SHOULD be shorter than 100 characters and MUST be shorter than 120 characters.
We try to avoid prescribing too much about using particular technologies because conventions change quickly.
- Dependencies are a liability and should have a clear ROI
- Check the license
- Check the version number and the activity
- It is often good to discuss adding new dependencies with other people on the project
Use Python 3 when possible.
Code Complete, by Steve McConnell, is about code construction. It is outdated in that it implicitly assumes a “waterfall” development strategy and does not discuss testing enough; however, most of its suggestions are clear and relevant.
Clean Code, by Robert C. Martin, describes how to write clean, readable code. Its suggestions overlap with Code Complete’s, but it is sufficiently different to warrant reading too. Some recommendations deal with Java-specific problems (e.g. the recommendation to minimize method arguments, and problems with argument ordering).