A word about software distribution methods

Ah, exams… I wish I were already done with them, but this year’s are quite important for my career as a physicist, and I can’t just show up with my hands in my pockets and expect to get away with average results. Because of this, I must regularly postpone the OSdeving work.

Once I get more time, I’d like to write an analysis of the interrupt structure of PPC, Sparc, IA-64, and Alpha, if I can find some data on them. Also, anyone who can point me to a more recent paper about interrupts on ARM (the one I read for the last article dates back to 2001) would be most welcome.

But in the meantime, here’s a condensed form of some longstanding design thoughts about how I’d like software to be distributed on this OS…

Qualities of the ideal software distribution method

In “software distribution”, I include every step which occurs between…

  • The instant where the developer, done with the coding work, has got their software and all its resource files packed in a single folder, a list of the external dependencies at hand, and wonders “what’s next ?”.
  • The instant where the user is done with installing the software, and only has to click or double-click on something to run it.

In an ideal world, a method for software distribution on personal computers should have the following properties :

  • Universality (Univ) : All software can be installed this way, including system/privileged software.
  • User-side simplicity (UserSimpl) : The effort to provide on the user’s side in order to install software is minimal.
  • Developer-side simplicity (DevSimpl) : The effort to provide on the developer’s side in order to distribute software is minimal.
  • Security (Secu) : The user doesn’t have to give full machine access to some form of untrusted script, let alone an untrusted binary. If the software installation process may have exploitable flaws, it is possible to patch them for all software packages at once.
  • OS Integration (Integ) : A shortcut to the software appears in the local variant of a “programs” menu, and the software can be launched from the command line on cli-centric OSs. File associations are made, programs which require it are put on startup, and the overall installation process respects a system-wide configuration. If update capabilities are present, they use a system-wide updater.
  • Transferability (Transf) : Software is packaged in a fashion that eases file transfers. It is packed in a single file, whose size is reduced as much as possible to ease downloads. After installation, it is easy to re-package software in order to send it to another computer without having to keep the original installer at hand : files are not randomly spread on mass storage devices, in a range of folders with cryptic names, without any way to reverse the process.
  • Discoverability (Discov) : The user may easily learn about the existence of the software and install it, using no more than a vanilla OS installation.
  • Decentralization (Decentr) : The software distribution process does not solely rely on a single repository, with all the issues that such a monopoly brings in terms of arbitrary software selection, suspiciously high distribution fees for developers, fragility of the system in the event that the repository maintainers go bankrupt, user blackmailing abilities…
  • Program/Data separation (ProgDataSep) : The files including user- and machine-specific data are cleanly separated from each other and both separated from the core software (binary and resource files), easing backups and privacy preservation. It’s a typical example of behavior whose systematic enforcement requires OS intervention.

Other criteria which I did not include are DRM (enforcing that software which shouldn’t be transferred can’t be transferred) and implementation complexity. Here’s why : all of the methods discussed below have been successfully implemented multiple times, so implementation complexity shouldn’t be that much of a concern, especially considering that software distribution is a fundamental part of the developer’s and user’s experience of the OS and deserves a lot of attention. As for DRM, well… I think that software vendors should implement their own protection scheme when they think they need it, first because such technology ages pretty quickly, and second because if they ask an OS to make sure that files aren’t copied, they’re just doing it wrong. There’s not much that the OS can do here that couldn’t be done by the software itself…

Software distribution methods in the Internet age

Nowadays, a few software distribution methods have come to dominate the software distribution market. Here they are :

  • Zipped folder (Zip) : The developer simply takes the folder mentioned above and puts it in a compressed file.
  • Zipped folder, with dependencies (ZipWithDeps) : A variant of the above where the developer also includes every nonstandard external library needed by the software in the zipped folder, either through static linking or by putting them in the folder, to ensure that the user is not going to encounter problems if they are not installed on their computer.
  • Zipped standardized application folders (AppFolder) : A variant of the zipped folder concept above where said folder respects some OS-specific conventions, allowing the OS to recognize it as a folder containing software and treat it accordingly. As an example, on Mac OS X, folders whose names end with “.app” and which have a certain internal structure, called “Application Bundles” by Apple, are treated as a single file by the GUI, with a double-click on them resulting in opening the binary as if the folder itself were a binary. A standardized folder structure may also be used to specify that a program can be associated with some file types, must run on boot, requires security privileges beyond those of standard user programs, has a splash screen that should be displayed while it’s being loaded, etc.
  • Installer : The developer distributes a big binary which is itself in charge of performing the installation of the software, typically using a statically linked ZipWithDeps and some rules.
  • Package : A variant of the installer system where the developer provides something that’s not a binary but rather a sort of script which is in turn read and executed by a standard system-wide installer program when opened. Typically offers the benefit of better system integration while keeping the flexibility of the installer system, as the script can run developer-provided binaries if needed. Examples of package formats include Windows Installer packages (.msi), Deb, and RPM.
  • Repository : Once a standard package format as described above exists, it is possible to envision putting lots of those standard packages on a server and having the user browse the gigantic library of available software using some standard OS software, selecting and downloading packages on it at will, keeping his packages updated as the repository gets updated too. Repository maintainers commonly take advantage of the “central” status of a repository to share libraries among packages (in order to avoid having to download them 30 times if they are used by 30 packages and to keep them up to date even for packages that are not updated anymore), sort packages in categories for easier browsing, etc.
  • Application stores (AppStores) : A variant of the repository system, currently specific to the mobile space, in which only one repository, provided by the OS manufacturer, is available, and it is impossible to use other repositories without some hacking.
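To make the AppFolder idea a bit more concrete, here is a minimal sketch of how an OS could recognize such a folder and read what it declares. Everything specific in it is an assumption made for illustration : the “.app” suffix is borrowed from the Mac OS X example above, while the `descriptor.json` file name and its fields (`name`, `binary`, `file_types`, `run_on_boot`) are hypothetical and do not correspond to any existing format.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path

DESCRIPTOR_NAME = "descriptor.json"  # hypothetical file name, not a real standard


@dataclass
class AppDescriptor:
    name: str
    binary: str                                     # executable path, relative to the folder
    file_types: list = field(default_factory=list)  # extensions the program can open
    run_on_boot: bool = False


def read_app_folder(folder: Path):
    """Return an AppDescriptor if `folder` looks like an application folder, else None."""
    descriptor_path = folder / DESCRIPTOR_NAME
    if folder.suffix != ".app" or not descriptor_path.is_file():
        return None
    data = json.loads(descriptor_path.read_text(encoding="utf-8"))

    # Refuse descriptors whose binary is missing or points outside the folder.
    binary = (folder / data.get("binary", "")).resolve()
    if not binary.is_file() or folder.resolve() not in binary.parents:
        return None

    return AppDescriptor(
        name=data.get("name", folder.stem),
        binary=data["binary"],
        file_types=list(data.get("file_types", [])),
        run_on_boot=bool(data.get("run_on_boot", False)),
    )
```

The point of the sketch is that the OS only reads declarative data : nothing provided by the developer is executed during “installation”, which is where the high Secu and Integ ratings of this method come from.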

Let’s see how well each fares as far as the criteria described above are concerned… (Note : I assume that the OS makes optimal use of the knowledge it has at hand about software packages, which is especially important in the case of application folders)

Method      | Univ       | UserSimpl       | DevSimpl | Secu         | Integ
Zip         | Medium [1] | Low [2][3]      | High     | High         | Low
ZipWithDeps | Medium [1] | Medium [2]      | High     | High         | Low
AppFolder   | High       | High            | Medium   | High         | High
Installer   | High       | Low [8]         | Medium   | Very Low [9] | Medium [5]
Package     | High       | High [5][11]    | Medium   | Low [12]     | High [5][11]
Repository  | High       | Very High [13]  | Low [14] | Medium [15]  | Very High [13]
AppStores   | High       | Very High [13]  | Low [14] | Medium [15]  | Very High [13]

Method      | Transf      | Discov  | Decentr     | ProgDataSep
Zip         | Very High   | Low [4] | High        | Medium [5]
ZipWithDeps | High [6]    | Low [4] | High        | Medium [5]
AppFolder   | High [6]    | Low [4] | High        | High [7]
Installer   | Low [10]    | Low [4] | High        | Medium [5]
Package     | Low [10]    | Low [4] | High        | High [7]
Repository  | Medium [16] | High    | Medium [17] | High [7]
AppStores   | Medium [16] | High    | Low         | High [7]

Interpretation of the ratings

  • High : The method performs well according to this criterion.
  • Medium : The method does not perform very well, but it does the job.
  • Low : The method is terrible according to this criterion.
  • Very High : The method performs exceptionally well according to this criterion, although “High” is already pretty good.
  • Very Low : The method is really awful from this point of view, although “Low” is already pretty bad.

Ratings explanations

  1. The software may only run with limited (“standard”) application privileges.
  2. The user has to look for the file in the folder hierarchy or create a shortcut himself.
  3. If external libraries are not present, the user must struggle with error messages and install said libraries on his own.
  4. Sure, software distributed this way can become more discoverable if it’s distributed on a popular website like download.com or receives some in-depth review, but nothing about this distribution method intrinsically helps the user to learn about the software.
  5. Depends on a developer’s good will…
  6. The inclusion of external libraries can make the package much heavier, even if it turns out that said library is already installed.
  7. The OS can prevent software from writing in the folder where it’s stored, and force it to cleanly separate code and user/machine-specific data.
  8. Inconsistent interfaces, endless button-pushing, attempts to lure the user into installing spyware… Proprietary installers are one of the worst things which ever happened to computing and should disappear.
  9. Giving arbitrary binaries admin rights simply because their icon looks like that of an installer and it has “setup” in its name ? Way to go as far as computer security is concerned…
  10. Once an installer/package has been used and deleted, there’s no simple and universal way to revert its operation and get the original file back.
  11. …but on well-done package systems, said developer has to explicitly bypass the normal operation of software to break this. In this case, it’s his fault, not the distribution method’s.
  12. Packages still offer developers the option to run untrusted code with admin privileges granted by the user. On the other hand, flaws within the package manager itself can be patched in a global way, whereas installers with security flaws will keep them for the whole life of that release of the software.
  13. Repository maintainers may check this themselves and, if the result is not good enough, correct it, notify the developer, or ban the package.
  14. The developer not only has to package his program in a specific way, but must also contact the repository maintainers and reach an agreement with them, which may also involve meeting some quality standards…
  15. Repository maintainers cannot magically detect malicious package installation procedures : going undetected is the whole point of a backdoor. What they can do is ban packages which are known to have caused harm once, which is better than nothing but still hurts for those who got burnt.
  16. Debatable. Reinstalling the package from the repository is only a matter of knowing its name, and good repositories make packages smaller by sharing libraries wisely among them. On the other hand, you don’t have a full transferability capability : if the repository goes down or bans your software as it considers it to be legacy, you’re still screwed.
  17. If your repository goes down, you may or may not be able to use another one, depending on how popular your OS is. But not all repositories are equal, and you’ll probably lose something in the process.

Which one would I use ?

Well, let’s have a look at what matters most to me and what I have at hand.

  • A consistent user experience is something which matters a lot to me, so I’d like to have only one universal and user-friendly software distribution method, which ensures that software is well-integrated within the OS.
  • I care a lot about security, so the less code runs as admin the better.
  • I may not stay here forever, and unless a miracle happens this OS project will probably remain a small one for its whole life. So I must care about what will happen if someday I disappear. In case this happens, users should not be reliant on me to get their packages, so decentralization is best. Experience of the internet also tells me that old packages often end up disappearing, so people should be able to easily transfer their software from one computer they own to another. All this will dramatically extend the life of this OS in the event that its development ceases.

With all this in mind, the best option seems to be the “AppFolder” one. It is quite strong in the areas which matter to me, and does a fairly good job in other areas. It asks developers to package their apps a bit more carefully than by simply compressing them, which nowadays is a reasonable requirement, and can be done easily and quickly using modern tools. It also allows the OS and its user to keep a tight grip on what third-party code is doing, for example by enforcing the separation of user-specific data and program files.
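For those who like to see the arithmetic, the choice can be roughly reproduced by turning the ratings table into numbers. This is only a sketch : the 0 to 4 scale and the weights are assumptions of mine (weight 2 for the criteria stressed in the list above, 1 for the others), not something the table itself prescribes.

```python
# Numeric scale for the ratings (an assumption made for this sketch).
SCORE = {"Very Low": 0, "Low": 1, "Medium": 2, "High": 3, "Very High": 4}

CRITERIA = ["Univ", "UserSimpl", "DevSimpl", "Secu", "Integ",
            "Transf", "Discov", "Decentr", "ProgDataSep"]

# Ratings copied from the table above, in the order of CRITERIA.
RATINGS = {
    "Zip":         ["Medium", "Low", "High", "High", "Low", "Very High", "Low", "High", "Medium"],
    "ZipWithDeps": ["Medium", "Medium", "High", "High", "Low", "High", "Low", "High", "Medium"],
    "AppFolder":   ["High", "High", "Medium", "High", "High", "High", "Low", "High", "High"],
    "Installer":   ["High", "Low", "Medium", "Very Low", "Medium", "Low", "Low", "High", "Medium"],
    "Package":     ["High", "High", "Medium", "Low", "High", "Low", "Low", "High", "High"],
    "Repository":  ["High", "Very High", "Low", "Medium", "Very High", "Medium", "High", "Medium", "High"],
    "AppStores":   ["High", "Very High", "Low", "Medium", "Very High", "Medium", "High", "Low", "High"],
}

# Hypothetical weights: criteria not listed here count once.
WEIGHTS = {"Univ": 2, "UserSimpl": 2, "Integ": 2, "Secu": 2, "Transf": 2, "Decentr": 2}


def weighted_total(method):
    """Sum of the method's ratings, each multiplied by its criterion weight."""
    return sum(WEIGHTS.get(criterion, 1) * SCORE[rating]
               for criterion, rating in zip(CRITERIA, RATINGS[method]))


ranking = sorted(RATINGS, key=weighted_total, reverse=True)
for method in ranking:
    print(f"{method:12} {weighted_total(method)}")
```

With these particular weights, AppFolder comes out just ahead of Repository (42 against 41), so the result is sensitive to how much importance is given to security, transferability and decentralization. With every weight set to 1, the two methods tie at 24.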

This approach has one big issue, though : in the age of repositories and app stores, it will be hard to get developers to tolerate a system where users cannot simply learn about their software by having a look at a massive catalog. If this project gets some traction, I’ll probably have to get around that problem, for example by offering a big software download website maintained by me, in the spirit of pbiDIR, BeBits, Download.com and Packman. This will make learning about new software relatively easy, without making repositories a core and vital part of software distribution on this OS.

What do you think about this ?

3 thoughts on “A word about software distribution methods”

  1. José Pedro May 2, 2011 / 12:56 am

    What worries me with this system is the lack of a centralized update system. What I usually see in Windows is that until I realize a new version of a program is launched, I will not update it. The last time I did a major system cleanup, I had programs which were not updated for 3 years… that might be related to booting Windows once every 6 months, but nonetheless, it is a very possible scenario for less “techy” users and “sloppy” developers.

    Some sort of repository system, no matter how simple, would be good. The simplest idea I had was something as absurd as having a text file with a link to an Atom feed in the AppFolder’s root. Each entry would have the version, changes and download link to the latest AppFolder.

    The system would scan each AppFolder for links to Atom feeds, download those feeds and check for changes. When a change was detected, the AppFolder would simply be replaced with the most recent version.

    This could also help implement simple dependencies by simply providing links to the other necessary “repository” links within the AppFolder.

  2. Hadrien May 2, 2011 / 7:57 am

    Yup. I totally forgot to mention it in this post, but I had something similar to what you mention in mind.

    The idea was to have a system-wide update system, but one which does not depend on a central repository. Each software distributor can provide a sort of mini-repository where new versions are provided in a standard way, typically mentioning the kind of changes made (security/bugfix/feature/compatibility breaking updates) and maybe providing a full changelog for those people who read them. Links to said repository (or lack thereof, if the dev chooses not to use one) are provided in the AppFolder’s (1) descriptor files, and the OS updater does the rest automagically according to user choices.

    I am not thinking about specific techs at this early design stage, but indeed an Atom feed might be useable for this purpose.

    (1) If I implement this, I must really find a better, easier on the tongue name. This one was just a trick to get a short word describing the concept.
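The feed-based update check discussed in these comments could look roughly like the sketch below. It is only an illustration under stated assumptions : each Atom entry is assumed to carry the version number as its title (in dotted numeric form such as “1.2.0”) and the download location in its link element, which is a hypothetical convention and not something the Atom format itself defines. Fetching the feed over the network and replacing the AppFolder are left out.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace of the Atom format


def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def latest_release(feed_xml):
    """Return (version, download_url) for the newest entry, or None if there is none."""
    root = ET.fromstring(feed_xml)
    releases = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        link = entry.find(ATOM + "link")
        if link is None or link.get("href") is None:
            continue
        try:
            version = parse_version(title)
        except ValueError:
            continue  # entry title is not a version number, skip it
        releases.append((version, link.get("href")))
    return max(releases, key=lambda release: release[0], default=None)


def update_available(installed_version, feed_xml):
    """Return the download URL of a newer release, or None if already up to date."""
    newest = latest_release(feed_xml)
    if newest is not None and newest[0] > parse_version(installed_version):
        return newest[1]
    return None
```

A system-wide updater would run `update_available` once per installed AppFolder, using the version and feed link found in that folder’s descriptor, and then apply the user’s update policy to the result.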
