Kernel Planet

January 15, 2018

Pete Zaitcev: New toy

Guess what.

A Russian pillowcase is much wider (or squar-er) than tubular American ones, so it works perfectly as a cover.

January 15, 2018 09:38 PM

January 12, 2018

Pete Zaitcev: Old news

Per U.S. News:

Alphabet Inc's (GOOG, GOOGL) Google said in 2016 that it was designing a server based on International Business Machines Corp's (IBM) Power9 processor.

Have they put anything into production since then? If not, why bring this up?

January 12, 2018 04:07 PM

January 11, 2018

Pete Zaitcev: A split home network

Real quick, why a 4-port router was needed.

  1. Red: Upstream link to ISP
  2. Grey: WiFi
  3. Blue: Entertainment stack
  4. Green: General Ethernet

The only reason to split the blue network is to prevent TiVo from attacking other boxes, such as desktops and printers. Yes, this is clearly not paranoid enough for a guy who insists on a dumb TV.

January 11, 2018 04:53 AM

January 10, 2018

Pete Zaitcev: Buying a dumb TV in 2018 America

I wanted to buy a TV a month ago and found that almost all of them are "Smart" nowadays. When I asked for a conventional TV, people ranging from a floor worker at Best Buy to Nikita Danilov at Facebook implied that I was an idiot. Still, I succeeded.

At first, I started looking at what is positioned as "conference room monitor". The NEC E506 is far away the leader, but it's expensive at $800 or so.

Then, I went to Fry's, who advertise quasi-brands like SILO. They had TVs on display, but were out. I was even desperate enough to be upsold to Athyme for $450, but they fortunately were out of that one too.

At that point, I headed to Best Buy, who have an exclusive agreement with Toshiba (h/t Matt Kern on Facebook). I was not happy to support this kind of distasteful arrangement, but very few options remained. There, it was either waiting for delivery, or driving 3 hours to a warehouse store. Considering how much my Jeep burns per mile, I declined.

Finally, I headed to a local Wal-Mart and bought a VISIO for $400 out the door. No fuss, no problem, easy peasy. Should've done that from the start.

P.S. Some people suggested buying a Smart TV and then not plugging it in. It includes not giving it the password for the house WiFi. Unfortunately, it is still problematic, as some of these TVs will associate with any open wireless network by default. An attacker drives by with a passwordless AP, and roots all TVs on the block. Unfortunately, I live an high-tech area where stuff like that happens all the time. When I mentioned it to Nikita, he thought that I was an idiot for sure. It's like a Russian joke about "dropping everything and moving to Uryupinsk."

January 10, 2018 06:29 PM

James Bottomley: GPL as the Best Licence – Governance and Philosophy

In the first part I discussed the balancing mechanisms the GPL provides for enabling corporate contributions, giving users a voice and the right one for mutually competing corporations to collaborate on equal terms.  In this part I’ll look at how the legal elements of the GPL licences make it pretty much the perfect one for supporting a community of developers co-operating with corporations and users.

As far as a summary of my talk goes, this series is complete.  However, I’ve been asked to add some elaboration on the legal structure of GPL+DCO contrasted to other CLAs and also include AGPL, so I’ll likely do some other one off posts in the Legal category about this.

Free Software vs Open Source

There have been many definitions of both of these.  Rather than review them, in the spirit of Humpty Dumpty, I’ll give you mine: Free Software, to me, means espousing a set of underlying beliefs about the code (for instance the four freedoms of the FSF).  While this isn’t problematic for many developers (code freedom, of course, is what enables developer driven communities) it is an anathema to most corporations and in particular their lawyers because, generally applied, it would require the release of all software based intellectual property.  Open Source on the other hand, to me, means that you follow all the rules of the project (usually licensing and contribution requirements) but don’t necessarily sign up to the philosophy underlying the project (if there is one; most Open Source projects won’t have one).

Open Source projects are compatible with Corporations because, provided they have some commonality in goals, even a corporation seeking to exploit a market can march a long way with a developer driven community before the goals diverge.  This period of marching together can be extremely beneficial for both the project and the corporation and if corporate priorities change, the corporation can simply stop contributing.  As I have stated before, Community Managers serve an essential purpose in keeping this goal alignment by making the necessary internal business adjustments within a corporation and by explaining the alignment externally.

The irony of the above is that collaborating within the framework of the project, as Open Source encourages, could work just as well for a Free Software project, provided the philosophical differences could be overcome (or overlooked).  In fact, one could go a stage further and theorize that the four freedoms as well as being input axioms to Free Software are, in fact, the generated end points of corporate pursuit of Open Source, so if the Open Source model wins in business, there won’t actually be a discernible difference between Open Source and Free Software.

Licences and Philosophy

It has often been said that the licence embodies the philosophy of the project (I’ve said it myself on more than one occasion, for which I’d now like to apologize).  However, it is an extremely reckless statement because it’s manifestly untrue in the case of GPL.  Neither v2 nor v3 does anything to require that adopters also espouse the four freedoms, although it could be said that the Tivoization Clause of v3, to which the kernel developers objected, goes slightly further down the road of trying to embed philosophy in the licence.  The reason for avoiding this statement is that it’s very easy for an inexperienced corporation (or pretty much any corporate legal counsel with lack of Open Source familiarity) to take this statement at face value and assume adopting the code or the licence will force some sort of viral adoption of a philosophy which is incompatible with their current business model and thus reject the use of reciprocal licences altogether.  Whenever any corporation debates using or contributing to Open Source, there’s inevitably an internal debate and this licence embeds philosophy argument is a powerful weapon for the Open Source opponents.

Equity in Contribution Models

Some licensing models, like those pioneered by Apache, are thought to require a foundation to pass the rights through under the licence: developers (or their corporations) sign a Contributor Licence Agreement (CLA) which basically grants the foundation redistributable licences to both copyrights and patents in the code and then the the Foundation licenses the contribution to the Project under Apache-2.  The net result is the outbound rights (what consumers of the project gets) are Apache-2 but the inbound rights (what contributors are required to give away) are considerably more.  The danger in this model is that control of the foundation gives control of the inbound rights, so who controls the foundation and how control may be transferred forms an important part of the analysis of what happens to contributor rights.  Note that this model is also the basis of open core, with a corporation taking the place of the foundation.

Inequity in the inbound versus the outbound rights creates an imbalance of power within the project between those who possess the inbound rights and everyone else (who only possess the outbound rights) and can damage developer driven communities by creating an alternate power structure (the one which controls the IP rights).  Further, the IP rights tend to be a focus for corporations, so simply joining the controlling entity (or taking a licence from it) instead of actually contributing to the project can become an end goal, thus weakening the technical contributions to the project and breaking the link with end users.

Creating equity in the licensing framework is thus a key to preserving the developer driven nature of a community.  This equity can be preserved by using the Inbound = Outbound principle, first pioneered by Richard Fontana, the essential element being that contributors should only give away exactly the rights that downstream recipients require under the licence.  This structure means there is no need for a formal CLA and instead a model like the Developer Certificate of Origin (DCO) can be used whereby the contributor simply places a statement in the source control of the project itself attesting to giving away exactly the rights required by the licence.  In this model, there’s no requirement to store non-electronic copies of the the contribution attestation (which inevitably seem to get lost), because the source control system used by the project does this.  Additionally, the source browsing functions of the source control system can trace a single line of code back exactly to all the contributor attestations thus allowing fully transparent inspection and independent verification of all the inbound contribution grants.

The Dangers of Foundations

Foundations which have no special inbound contribution rights can still present a threat to the project by becoming an alternate power structure.  In the worst case, the alternate power structure is cemented by the Foundation having a direct control link with the project, usually via some Technical Oversight Committee (TOC).  In this case, the natural Developer Driven nature of the project is sapped by the TOC creating uncertainty over whether a contribution should be accepted or not, so now the object isn’t to enthuse fellow developers, it’s to please the TOC.  The control confusion created by this type of foundation directly atrophies the project.

Even if a Foundation specifically doesn’t create any form of control link with the project, there’s still the danger that a corporation’s marketing department sees joining the Foundation as a way of linking itself with the project without having to engage the engineering department, and thus still causing a weakening in both potential contributions and the link between the project and its end users.

There are specific reasons why projects need foundations (anything requiring financial resources like conferences or grants requires some entity to hold the cash) but they should be driven by the need of the community for a service and not by the need of corporations for an entity.

GPL+DCO as the Perfect Licence and Contribution Framework

Reciprocity is the key to this: the requirement to give back the modifications levels the playing field for corporations by ensuring that they each see what the others are doing.  Since there’s little benefit (and often considerable down side) to hiding modifications and doing a dump at release time, it actively encourages collaboration between competitors on shared features.  Reciprocity also contains patent leakage as we saw in Part 1.  Coupled with the DCO using the Inbound = Outbound principle, means that the Licence and DCO process are everything you need to form an effective and equal community.

Equality enforced by licensing coupled with reciprocity also provides a level playing field for corporate contributors as we saw in part 1, so equality before the community ensures equity among all participants.  Since this is analogous to the equity principles that underlie a lot of the world’s legal systems, it should be no real surprise that it generates the best contribution framework for the project.  Best of all, the model works simply and effectively for a group of contributors without necessity for any more formal body.

Contributions and Commits

Although GPL+DCO can ensure equity in contribution, some human agency is still required to go from contribution to commit.  The application of this agency is one of the most important aspects to the vibrancy of the project and the community.  The agency can be exercised by an individual or a group; however, the composition of the agency doesn’t much matter, what does is that the commit decisions of the agency essentially (and impartially) judge the technical merit of the contribution in relation to the project.

A bad commit agency can be even more atrophying to a community than a Foundation because it directly saps the confidence the community has in the ability of good (or interesting) code to get into the tree.  Conversely, a good agency is simply required to make sound technical decisions about the contribution, which directly preserves the confidence of the community that good code gets into the tree.   As such, the role requires leadership, impartiality and sound judgment rather than any particular structure.

Governance and Enforcement

Governance seems to have many meanings depending on context, so lets narrow it to the rules by which the project is run (this necessarily includes gathering the IP contribution rights) and how they get followed. In a GPL+DCO framework, the only additional governance component required is the commit agency.

However, having rules isn’t sufficient unless you also follow them; in other words you need some sort of enforcement mechanism.  In a non-GPL+DCO system, this usually involves having an elaborate set of sanctions and some sort of adjudication system, which, if not set up correctly, can also be a source of inequity and project atrophy.  In a GPL+DCO system, most of the adjudication system and sanctions can be replaced by copyright law (this was the design of the licence, after all), which means licence enforcement (or at least the threat of it) becomes the enforcement mechanism.  The only aspect of governance this doesn’t cover is the commit agency.  However, with no other formal mechanisms to support its authority, the commit agency depends on the trust of the community to function and could easily be replaced by that community simply forking the tree and trusting a new commit agency.

The two essential corollaries of the above is that enforcement does serve an essential governance purpose in a GPL+DCO ecosystem and lack of a formal power structure keeps the commit agency honest because the community could replace it.

The final thing worth noting is that too many formal rules can also seriously weaken a project by encouraging infighting over rule interpretations, how exactly they should be followed and who did or did not dot the i’s and cross the t’s.  This makes the very lack of formality and lack of a formalised power structure which the GPL+DCO encourages a key strength of the model.


In the first part I concluded that the GPL fostered the best ecosystem between developers, corporations and users by virtue of the essential ecosystem fairness it engenders.  In this part I conclude that formal control structures are actually detrimental to a developer driven community and thus the best structural mechanism is pure GPL+DCO with no additional formality.  Finally I conclude that this lack of ecosystem control is no bar to strong governance, since that can be enforced by any contributor through the copyright mechanism, and the very lack of control is what keeps the commit agency correctly serving the community.

January 10, 2018 04:38 PM

January 08, 2018

Pete Zaitcev: Caches are like the government

From an anonymous author, a follow-up to the discussion about the cache etc.:

counterpoint 1: Itanium, which was EPIC like Elbrus, failed even with Intel behind it. And it added prefetching before the end. Source:

counterpoint 2: To get fast, Elbrus has also added at least one kind of prefetch (APB, "Array Prefetch Buffer") and has the multimegabyte cache that Zaitcev decries. Source: [kozhin2016, 10.1109/EnT.2016.027]

counterpoint 3: "According to Keith Diefendorff, in 1978 almost 15 years ahead of Western superscalar processors, Elbrus implemented a two-issue out-of-order processor with register renaming and speculative execution"

1. Itanium, as I recall, suffered from the poor initial implementation too much. Remember that 1st implementation was designed in Intel, while the 2nd implementation was designed at HP. Intel's chip stunk on ice. By the time HP came along, AMD64 became a thing, and then it was over.

Would Itanium win over the AMD64 if it were better established, burned less power, and were faster, sooner? There's no telling. The compatibility is an important consideration, and the binary translation was very shaky back then, unless you count Crusoe.

2. It's quite true that modern Elbrus runs with a large cache. That is because cache is obviously beneficial. All this is about is to consider once again if better software control of caches, and their better architecture in general, would disrupt side-channel signalling and bring performance advantages.

By the way, people might not remember it now, but a large chunk of Opteron's performance derived from its excellent memory controller. It's a component of CPU that tended not to get noticed, but it's essential. Fortunately, the Rowhammer vulnerability drew some much-needed attention to it, as well as a possible role for software control there.

3. Well, Prof. Babayan's own outlook at Elbrus-2 and its superscalar, out-of-order core was, "As you can see, I tried this first, and found that VLIW was better", which is why Elbrus-3 disposed with all that stuff. Naturally, all that stuff came back when we started to find the limits of EPIC (nee VLIW), just like the cache did.

January 08, 2018 10:51 PM

January 06, 2018

Greg Kroah-Hartman: Meltdown and Spectre Linux kernel status

By now, everyone knows that something “big” just got announced regarding computer security. Heck, when the Daily Mail does a report on it , you know something is bad…

Anyway, I’m not going to go into the details about the problems being reported, other than to point you at the wonderfully written Project Zero paper on the issues involved here. They should just give out the 2018 Pwnie award right now, it’s that amazingly good.

If you do want technical details for how we are resolving those issues in the kernel, see the always awesome writeup for the details.

Also, here’s a good summary of lots of other postings that includes announcements from various vendors.

As for how this was all handled by the companies involved, well this could be described as a textbook example of how NOT to interact with the Linux kernel community properly. The people and companies involved know what happened, and I’m sure it will all come out eventually, but right now we need to focus on fixing the issues involved, and not pointing blame, no matter how much we want to.

What you can do right now

If your Linux systems are running a normal Linux distribution, go update your kernel. They should all have the updates in them already. And then keep updating them over the next few weeks, we are still working out lots of corner case bugs given that the testing involved here is complex given the huge variety of systems and workloads this affects. If your distro does not have kernel updates, then I strongly suggest changing distros right now.

However there are lots of systems out there that are not running “normal” Linux distributions for various reasons (rumor has it that it is way more than the “traditional” corporate distros). They rely on the LTS kernel updates, or the normal stable kernel updates, or they are in-house franken-kernels. For those people here’s the status of what is going on regarding all of this mess in the upstream kernels you can use.

Meltdown – x86

Right now, Linus’s kernel tree contains all of the fixes we currently know about to handle the Meltdown vulnerability for the x86 architecture. Go enable the CONFIG_PAGE_TABLE_ISOLATION kernel build option, and rebuild and reboot and all should be fine.

However, Linus’s tree is currently at 4.15-rc6 + some outstanding patches. 4.15-rc7 should be out tomorrow, with those outstanding patches to resolve some issues, but most people do not run a -rc kernel in a “normal” environment.

Because of this, the x86 kernel developers have done a wonderful job in their development of the page table isolation code, so much so that the backport to the latest stable kernel, 4.14, has been almost trivial for me to do. This means that the latest 4.14 release (4.14.12 at this moment in time), is what you should be running. 4.14.13 will be out in a few more days, with some additional fixes in it that are needed for some systems that have boot-time problems with 4.14.12 (it’s an obvious problem, if it does not boot, just add the patches now queued up.)

I would personally like to thank Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra, Josh Poimboeuf, Juergen Gross, and Linus Torvalds for all of the work they have done in getting these fixes developed and merged upstream in a form that was so easy for me to consume to allow the stable releases to work properly. Without that effort, I don’t even want to think about what would have happened.

For the older long term stable (LTS) kernels, I have leaned heavily on the wonderful work of Huge Dickens, Dave Hansen, Jiri Kosina and Borislav Petkov to bring the same functionality to the 4.4 and 4.9 stable kernel trees. I had also had immense help from Guenter Roeck, Kees Cook, Jamie Iles, and many others in tracking down nasty bugs and missing patches. I want to also call out David Woodhouse, Eduardo Valentin, Laura Abbott, and Rik van Riel for their help with the backporting and integration as well, their help was essential in numerous tricky places.

These LTS kernels also have the CONFIG_PAGE_TABLE_ISOLATION build option that should be enabled to get complete protection.

As this backport is very different from the mainline version that is in 4.14 and 4.15, there are different bugs happening, right now we know of some VDSO issues that are getting worked on, and some odd virtual machine setups are reporting strange errors, but those are the minority at the moment, and should not stop you from upgrading at all right now. If you do run into problems with these releases, please let us know on the stable kernel mailing list.

If you rely on any other kernel tree other than 4.4, 4.9, or 4.14 right now, and you do not have a distribution supporting you, you are out of luck. The lack of patches to resolve the Meltdown problem is so minor compared to the hundreds of other known exploits and bugs that your kernel version currently contains. You need to worry about that more than anything else at this moment, and get your systems up to date first.

Also, go yell at the people who forced you to run an obsoleted and insecure kernel version, they are the ones that need to learn that doing so is a totally reckless act.

Meltdown – ARM64

Right now the ARM64 set of patches for the Meltdown issue are not merged into Linus’s tree. They are staged and ready to be merged into 4.16-rc1 once 4.15 is released in a few weeks. Because these patches are not in a released kernel from Linus yet, I can not backport them into the stable kernel releases (hey, we have rules for a reason…)

Due to them not being in a released kernel, if you rely on ARM64 for your systems (i.e. Android), I point you at the Android Common Kernel tree All of the ARM64 fixes have been merged into the 3.18, 4.4, and 4.9 branches as of this point in time.

I would strongly recommend just tracking those branches as more fixes get added over time due to testing and things catch up with what gets merged into the upstream kernel releases over time, especially as I do not know when these patches will land in the stable and LTS kernel releases at this point in time.

For the 4.4 and 4.9 LTS kernels, odds are these patches will never get merged into them, due to the large number of prerequisite patches required. All of those prerequisite patches have been long merged and tested in the android-common kernels, so I think it is a better idea to just rely on those kernel branches instead of the LTS release for ARM systems at this point in time.

Also note, I merge all of the LTS kernel updates into those branches usually within a day or so of being released, so you should be following those branches no matter what, to ensure your ARM systems are up to date and secure.


Now things get “interesting”…

Again, if you are running a distro kernel, you might be covered as some of the distros have merged various patches into them that they claim mitigate most of the problems here. I suggest updating and testing for yourself to see if you are worried about this attack vector

For upstream, well, the status is there is no fixes merged into any upstream tree for these types of issues yet. There are numerous patches floating around on the different mailing lists that are proposing solutions for how to resolve them, but they are under heavy development, some of the patch series do not even build or apply to any known trees, the series conflict with each other, and it’s a general mess.

This is due to the fact that the Spectre issues were the last to be addressed by the kernel developers. All of us were working on the Meltdown issue, and we had no real information on exactly what the Spectre problem was at all, and what patches were floating around were in even worse shape than what have been publicly posted.

Because of all of this, it is going to take us in the kernel community a few weeks to resolve these issues and get them merged upstream. The fixes are coming in to various subsystems all over the kernel, and will be collected and released in the stable kernel updates as they are merged, so again, you are best off just staying up to date with either your distribution’s kernel releases, or the LTS and stable kernel releases.

It’s not the best news, I know, but it’s reality. If it’s any consolation, it does not seem that any other operating system has full solutions for these issues either, the whole industry is in the same boat right now, and we just need to wait and let the developers solve the problem as quickly as they can.

The proposed solutions are not trivial, but some of them are amazingly good. The Retpoline post from Paul Turner is an example of some of the new concepts being created to help resolve these issues. This is going to be an area of lots of research over the next years to come up with ways to mitigate the potential problems involved in hardware that wants to try to predict the future before it happens.

Other arches

Right now, I have not seen patches for any other architectures than x86 and arm64. There are rumors of patches floating around in some of the enterprise distributions for some of the other processor types, and hopefully they will surface in the weeks to come to get merged properly upstream. I have no idea when that will happen, if you are dependant on a specific architecture, I suggest asking on the arch-specific mailing list about this to get a straight answer.


Again, update your kernels, don’t delay, and don’t stop. The updates to resolve these problems will be continuing to come for a long period of time. Also, there are still lots of other bugs and security issues being resolved in the stable and LTS kernel releases that are totally independent of these types of issues, so keeping up to date is always a good idea.

Right now, there are a lot of very overworked, grumpy, sleepless, and just generally pissed off kernel developers working as hard as they can to resolve these issues that they themselves did not cause at all. Please be considerate of their situation right now. They need all the love and support and free supply of their favorite beverage that we can provide them to ensure that we all end up with fixed systems as soon as possible.

January 06, 2018 12:36 PM

January 05, 2018

Pete Zaitcev: Police action in the drone-to-helicopter collision

The year 2017 was the first year when a civilian multicopter drone collided with a manned aircraft. It was expected for a while and there were several false starts. One thing is curious though - how did they find the operator of the drone? I presume it wasn't something simple like a post on Facebook with a video of the collision. They must've polled witnesses in the area, then looked at surveilance cameras or whatnot, to get it narrowed to vehicles.

UPDATE: Readers mkevac and veelmore inform that a serialized part of the drone was recovered, and the investigators worked through seller records to identify the buyer.

January 05, 2018 11:06 PM

Pete Zaitcev: Prof. Babayan's Revenge

Someone at GNUsocial posted:

I suspect people trying to find alternate CPU architectures that don't suffer from #Spectre - like bugs have misunderstood how fundamental the problem is.Your CPU will not go fast without caches. Your CPU will not go fast without speculative execution. Solving the problem will require more silicon, not less. I don't think the market will accept the performance hit implied by simpler architectures. OS, compiler and VM (including the browser) workarounds are the way this will get mitigated.

CPUs will not go fast without caches and speculative execution, you say? Prof. Babayan may have something to say about that. Back when I worked under him in the 1990s, he considered caches a primitive workaround.

The work on Narch was informed by the observation that the submicron feature size provided designers with more silicon they knew what to do with. So, the task of a CPU designer was to identify ways to use massive amounts of gates productively. But instead, mediocre designers simply added more cache, even multi-level cache.

Talking about it was not enough, so he set out to design and implement his CPU, called "Narch" (later commercialized as "Elbrus-2000"). And he did. The performance was generally on par with its contemporaries, such as Pentium III and UltraSparc. It had a cache, but measured in kilobytes, not megabytes. But there were problems beyond the cache.

The second part of the Bee Yarn Knee's objection deals with the speculative execution. Knocking that out required a software known as a binary translator, which did basically the same thing, only in software[*]. Frankly at this point I cannot guarantee that it weren't possible to abuse that mechanism for unintentional signaling in the same ways Meltdown works. You don't have cache for timing signals in Narch, but you do have the translator, which can be timed if it runs at run time like in Transmeta Crusoe. In Narch's case it only ran ahead of time, so not exploitable, but the result turned out to be not fast enough for workloads that make a good use of speculative execution today (such as LISP and gcc).

Still, I think that a blanket objection that CPU cannot run fast with no cache and no speculative execution, IMHO, is informed by ignorance of alternatives. I cannot guarantee that E2k would solve the problem for good, after all its later models sit on top of a cache. But at least we have a hint.

[*] The translator grew from a language toolchain and could be used in creative ways to translate source. It would not be binary in such case. I omit a lot of detail here.

UPDATE: Oh, boy:

But the speedup from speculative execution IS from parallelism. We're just asking the CPU to find it instead of the compiler. So couldn't you move the smarts into the compiler?

Sean, this is literally what they said 30 years ago.

January 05, 2018 04:56 PM

January 04, 2018

Kees Cook: SMEP emulation in PTI

An nice additional benefit of the recent Kernel Page Table Isolation (CONFIG_PAGE_TABLE_ISOLATION) patches (to defend against CVE-2017-5754, the speculative execution “rogue data cache load” or “Meltdown” flaw) is that the userspace page tables visible while running in kernel mode lack the executable bit. As a result, systems without the SMEP CPU feature (before Ivy-Bridge) get it emulated for “free”.

Here’s a non-SMEP system with PTI disabled (booted with “pti=off“), running the EXEC_USERSPACE LKDTM test:

# grep smep /proc/cpuinfo
# dmesg -c | grep isolation
[    0.000000] Kernel/User page tables isolation: disabled on command line.
# cat <(echo EXEC_USERSPACE) > /sys/kernel/debug/provoke-crash/DIRECT
# dmesg
[   17.883754] lkdtm: Performing direct entry EXEC_USERSPACE
[   17.885149] lkdtm: attempting ok execution at ffffffff9f6293a0
[   17.886350] lkdtm: attempting bad execution at 00007f6a2f84d000

No crash! The kernel was happily executing userspace memory.

But with PTI enabled:

# grep smep /proc/cpuinfo
# dmesg -c | grep isolation
[    0.000000] Kernel/User page tables isolation: enabled
# cat <(echo EXEC_USERSPACE) > /sys/kernel/debug/provoke-crash/DIRECT
# dmesg
[   33.657695] lkdtm: Performing direct entry EXEC_USERSPACE
[   33.658800] lkdtm: attempting ok execution at ffffffff926293a0
[   33.660110] lkdtm: attempting bad execution at 00007f7c64546000
[   33.661301] BUG: unable to handle kernel paging request at 00007f7c64546000
[   33.662554] IP: 0x7f7c64546000

It should only take a little more work to leave the userspace page tables entirely unmapped while in kernel mode, and only map them in during copy_to_user()/copy_from_user() as ARM already does with ARM64_SW_TTBR0_PAN (or CONFIG_CPU_SW_DOMAIN_PAN on arm32).

© 2018, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
Creative Commons License

January 04, 2018 09:43 PM

Pete Zaitcev: More bugs

Speaking of stupid bugs that make no sense to report, Anaconda fails immediately in F27 if one of the disks has an exported volume group on it — in case you thought it was a clever way to protect some data from being overwritten accidentally by the installation. The workaround was to unplug the drives that contained the PVs in the problematic VG. Now, why not to report this? It's 100% reproducible. But reporting presumes a responsibility to re-test, and I'm not going to install a fresh Fedora in a long time again, hopefully, so I'm not in a position to discharge my bug reporter's responsibilities.

January 04, 2018 08:27 PM

January 03, 2018

James Bottomley: GPL as the best licence – Community, Code and Licensing

This article is the first of  a set supporting the conclusion that the GPL family of copy left licences are the best ones for maintaining a healthy development pace while providing a framework for corporations and users to influence the code base.  It is based on an expansion of the thoughts behind the presentation GPL: The Best Business Licence for Corporate Code at the Compliance Summit 2017 in Yokohama.

A Community of Developers

The standard definition of any group of people building some form of open source software is usually that they’re developers (people with the necessary technical skills to create or contribute to the project).  In pretty much every developer driven community, they’re doing it because they get something out of the project itself (this is the scratch your own itch idea in the Cathedral and the Bazaar): usually because they use the project in some form, but sometimes because they’re fascinated by the ideas it embodies (this latter is really how the Linux Kernel got started because ordinarily a kernel on its own isn’t particularly useful but, for a lot of the developers, the ideas that went into creating unix were enormously fascinating and implementations were completely inaccessible in Europe thanks to the USL vs BSDi lawsuit).

The reason for discussing developer driven communities is very simple: they’re the predominant type of community in open source (Think Linux Kernel, Gnome, KDE etc) which implies that they’re the natural type of community that forms around shared code collaboration.  In this model of interaction, community and code are interlinked: Caring for the code means you also care for the community.  The health of this type of developer community is very easily checked: ask how many contributors would still contribute to the project if they weren’t paid to do it (reduction in patch volume doesn’t matter, just the desire to continue sending patches).  If fewer than 50% of the core contributors would cease contributing if they weren’t paid then the community is unhealthy.

Developer driven communities suffer from three specific drawbacks:

  1. They’re fractious: people who care about stuff tend to spend a lot of time arguing about it.  Usually some form of self organising leadership fixes a significant part of this, but it’s not guaranteed.
  2. Since the code is built by developers for developers (which is why they care about it) there’s no room for users who aren’t also developers in this model.
  3. The community is informal so there’s no organisation for corporations to have a peer relationship with, plus developers don’t tend to trust corporate motives anyway making it very difficult for corporations to join the community.

Trusting Corporations and Involving Users

Developer communities often distrust the motives of corporations because they think corporations don’t care about the code in the same way as developers do.  This is actually completely true: developers care about code for its own sake but corporations care about code only as far as it furthers their business interests.  However, this business interest motivation does provide the basis for trust within the community: as long as the developer community can see and understand the business motivation, they can trust the Corporation to do the right thing; within limits, of course, for instance code quality requirements of developers often conflict with time to release requirements for market opportunity.  This shared interest in the code base becomes the framework for limited trust.

Enter the community manager:  A community manager’s job, when executed properly, is twofold: one is to take corporate business plans and realign them so that some of the corporate goals align with those of useful open source communities and the second is to explain this goal alignment to the relevant communities.  This means that a good community manager never touts corporate “community credentials” but instead explains in terms developers can understand the business reasons why the community and the corporation should work together.  Once the goals are visible and aligned, the developer community will usually welcome the addition of paid corporate developers to work on the code.  Paying for contributions is the most effective path for Corporations to exert significant influence on the community and assurance of goal alignment is how the community understands how this influence benefits the community.

Involving users is another benefit corporations can add to the developer ecosystem.  Users who aren’t developers don’t have the technical skills necessary to make their voices and opinions heard within the developer driven community but corporations, which usually have paying users in some form consuming the packaged code, can respond to user input and could act as a proxy between the user base and the developer community.  For some corporations responding to user feedback which enhances uptake of the product is a natural business goal.  For others, it could be a goal the community manager pushes for within the corporation as a useful goal that would help business and which could be aligned with the developer community.  In either case, as long as the motives and goals are clearly understood, the corporation can exert influence in the community directly on behalf of users.

Corporate Fear around Community Code

All corporations have a significant worry about investing in something which they don’t control. However, these worries become definite fears around community code because not only might it take a significant investment to exert the needed influence, there’s also the possibility that the investment might enable a competitor to secure market advantage.

Another big potential fear is loss of intellectual property in the form of patent grants.  Specifically, permissive licences with patent grants allow any other entity to take the code on which the patent reads, incorporate it into a proprietary code base and then claim the benefit of the patent grant under the licence.  This problem, essentially, means that, unless it doesn’t care about IP leakage (or the benefit gained outweighs the problem caused), no corporation should contribute code to which they own patents under a permissive licence with a patent grant.

Both of these fears are significant drivers of “privatisation”, the behaviour whereby a corporation takes community code but does all of its enhancements and modifications in secret and never contributes them back to the community, under the assumption that bearing the forking cost of doing this as less onerous than the problems above.

GPL is the Key to Allaying these Fears

The IP leak fear is easily allayed: whether the version of GPL that includes an explicit or implicit patent licence, the IP can only leak as far as the code can go and the code cannot be included in a proprietary product because of the reciprocal code release requirements, thus the Corporation always has visibility into how far the IP rights might leak by following licence mandated code releases.

GPL cannot entirely allay the fear of being out competed with your own code but it can, at least, ensure that if a competitor is using a modification of your code, you know about it (as do your competition), so everyone has a level playing field.  Most customers tend to prefer active participants in open code bases, so to be competitive in the market places, corporations using the same code base tend to be trying to contribute actively.  The reciprocal requirements of GPL provide assurance that no-one can go to market with a secret modification of the code base that they haven’t shared with others.  Therefore, although corporations would prefer dominance and control, they’re prepared to settle for a fully level playing field, which the GPL provides.

Finally, from the community’s point of view, reciprocal licences prevent code privatisation (you can still work from a fork, but you must publish it) and thus encourage code sharing which tends to be a key community requirement.


In this first part, I conclude that the GPL, by ensuring fairness between mutually distrustful contributors and stemming IP leaks, can act as a guarantor of a workable code ecosystem for both developers and corporations and, by using the natural desire of corporations to appeal to customers, can use corporations to bridge the gap between user requirements and the developer community.

In the second part of this series, I’ll look at philosophy and governance and why GPL creates self regulating ecosystems which give corporations and users a useful function while not constraining the natural desire of developers to contribute and contrast this with other possible ecosystem models.

January 03, 2018 11:43 PM

January 02, 2018

Pete Zaitcev: The gdm spamming logs in F27, RHbz#1322588

Speaking of the futility of reporting bugs, check out the 1322588. Basically, gdm tries to adjust the screen brightness when a user is already logged in on that screen (fortunately, it fails). Fedora users report the bug, the maintainer asks them to report it upstream. They report it upstream. The upstream tinkers with something tangentially related, closes the bug. Maintainer closes the bug in Fedora. The issue is not fixed, users re-open the bug and the process continues. It was going on for coming up to 2 years now. I don't know why the GNOME upstream cannot program gdm not to screw with the screen after the very same gdm has logged a user in. It's beyond stupid, and I don't know what can be done. I can buy a Mac, I suppose.


-- Comment #71 from Cédric Bellegarde
Simple workaround:
- Disable auto brightness in your gnome session
- Logout and stop gdm
- Copy ~/.config/dconf/user to /var/lib/gdm/.config/dconf/user

UPDATE 2018-01-10:

-- Comment #75 from Bastien Nocera
Maybe you can do something there, instead of posting passive aggressive blog entries.

Back when Bastien maintained xine, we enjoyed a cordial working relationship, but I guess that does not count for anything.

January 02, 2018 11:36 PM

Pete Zaitcev: No more VLAN in the home network

Thanks to Fedora dropping the 32-bit x86 (i686) in F27, I had no choice but to upgrade the home router. I used this opportunity to get rid of VLANs and return to a conventional setup with 4 Eithernet ports. The main reason is, VLANs were not entirely stable in Fedora. Yes, they mostly worked, but I could never be sure that they would continue to work. Also, mostly in this context means, for example, that some time around F24 the boot-up process started hanging on the "Starting the LSB Networking" job for about a minute. It never was worth the trouble raising any bugs or tickets with upstreams, I never was able to resolve a single one of them. Not in Zebra, not in radvd, not in NetworkManager. Besides, if something is broken, I need a solution right now, not when developers turn it around. I suppose VLANs could be allright if I stuck to initscripts, but I needed NetworkManager to interact properly with the upstream ISP at some point. So, whatever. Fedora costed me $150 for the router and killed my VLAN setup.

I looked at ARM routers, but there was nothing. Or, nothing affordable that was SBSA and RHEL compatible. Sorry, ARM, you're still immature. Give me a call when you grow up.

Buying from Chinese was a mostly typical experience. They try to do good, but... Look at the questions about the console pinout at Amazon. The official answer is, "Hello,the pinouts is 232." Yes, really. When I tried to contact them by e-mail, they sent me a bunch of pictures that included pinouts for Ethernet RJ-45, pinout for motherboard header, and a photograph of a Cisco console cable. No, they don't use Cisco pinout. Instead, they use DB9 pin numbers on RJ-45 (obviously, pin 9 is not connected). It was easy to figure out using a multimeter, but I thought I'd ask properly first. The result was very stereotypical.

P.S. The bright green light is blink(1), a Christmas present from my daughter. I'm not yet using it to its full potential. The problem is, if it only shows a static light, it cannot indicate if the router hangs or fails to boot. It needs some kind of daemon job that constantly changes it.

P.P.S. The SG200 is probably going into the On-Q closet, where it may actually come useful.

P.P.P.S. There's a PoE injector under the white cable loop somewhere. It powers a standalone Cisco AP, a 1040 model.

January 02, 2018 11:08 PM

January 01, 2018

Paul E. Mc Kenney: 2017 Year-End Advice

One of the occupational hazard of being an old man is the urge to provide unsolicited advice on any number of topics. This time, the topic is weight lifting.

Some years ago, I decided to start lifting weights. My body no longer tolerated running, so I had long since substituted various low-impact mechanical means of aerobic exercise. But there was growing evidence that higher muscle mass is a good thing as one ages, so I figured I should give it a try. This posting lists a couple of my mistakes, which could enable you to avoid them, which in turn could enable you to make brand-spanking new mistakes of your very own design!

The first mistake resulted in sporadic pains in my left palm and wrist, which appeared after many months of upper-body weight workouts. In my experience, at my age, any mention of this sort of thing to medical professionals will result in a tentative diagnosis of arthritis, with the only prescription being continued observation. This experience motivated me to do a bit of self-debugging beforehand, which led me to notice that the pain was only in my left wrist and only in the center of my left palm. This focused my attention on my two middle fingers, especially the one on which I have been wearing a wedding ring pretty much non-stop since late 1985. (Of course, those prone to making a certain impolite hand gesture might have reason to suspect their middle finger.)

So I tried removing my wedding ring. I was unable to do so, even after soaking my hand for some minutes in a bath of water, soap, and ice. This situation seemed like a very bad thing, regardless of what might be causing the pain. I therefore consulted my wife, who suggested a particular jewelry store. Shortly thereafter, I was sitting in a chair while a gentleman used a tiny but effective hand-cranked circular saw to cut through the ring and a couple pairs of pliers to open it up. The gentleman was surprised that it took more than ten turns of the saw to cut through the ring, in contrast to the usual three turns. Apparently wearing a ring for more than 30 years can cause it to work harden.

The next step was for me to go without a ring for a few weeks to allow my finger to decide what size it wanted to be, now that it had a choice. They gave me back the cut-open ring, which I carried in my pocket. Coincidence or not, during that time, the pains in my wrists and palms vanished. Later, jewelry store resized the ring.

I now remove my ring every night. If you take up any sort of weight lifting involving use of your hands, I recommend that you also remove any rings you might wear, just to verify that you still can.

My second mistake was to embark upon a haphazard weight-lifting regime. I felt that this was OK because I wasn't training for anything other than advanced age, so that any imbalances should be fairly easily addressed.

My body had other ideas, especially in connection with the bout of allergy/asthma/sinitus/brochitis/whatever that I have (knock on wood) mostly recovered from. This condition of course results in coughing, in which the muscles surrounding your chest work together to push air out of your lungs as abruptly and quickly as humanly possible. (Interestingly enough, the maximum velocity of cough-driven air seems to be subject to great dispute, perhaps because it is highly variable and because there are so many different places you could measure it.)

The maximum-effort nature of a cough is just fine if your various chest muscles are reasonably evenly matched. Unfortunately, I had not concerned myself with the effects of my weight-lifting regime on my ability to cough, so I learned the hard way that the weaker muscles might object to this treatment, and make their objections known by going into spasms. Spasms involving one's back can be surprisingly difficult to pin down, but for me, otherwise nonsensical shooting pains involving the neck and head are often due to something in my back. I started some simple and gentle back exercises, and also indulged in Warner Brothers therapy, which involves sitting in an easy chair watching Warner Brothers cartoons, assisted by a heating pad lent by my wife.

In summary, if you are starting weight training, (1) take an organized approach and (2) remove any rings you are wearing at least once a week.

Other than that, have a very happy new year!!!

January 01, 2018 02:29 AM

December 29, 2017

Dave Airlie (blogspot): radv and vega conformance test status

We've been passing the vulkan conformance test suite 1.0.2 mustpass list on radv for quite a while now on the CIK/VI/Polaris cards. However Vega hadn't achieved the same pass rate.

With a bunch of fixes I pushed this morning and one fix for all GPUs, we now have the same pass rate on all GPUs and 0 fails.

This means Vega on radv can now be submitted for conformance under Vulkan 1.0,  not sure when I'll get time to do the paperwork, maybe early next year sometime.

December 29, 2017 03:24 AM

December 26, 2017

Pavel Machek: PostmarketOS and digital cameras

I did some talking in 2017. If you want to learn about postmarketOS (in Czech), there's recording at . ELCE talk about status of phone cameras is at .

December 26, 2017 10:59 AM

December 23, 2017

Paul E. Mc Kenney: Book review: "Engineering Reminiscences" and "Tears of Stone"

I believe that Charles T. Porter's “Engineering Reminiscences“ was a gift from my grandfather, who was himself a machinist. Porter's most prominent contribution was the high-speed steam engine, that is to say, a steam engine operating at more than about 100 RPM. Although steam engines and their governors proved to be somewhat of a dead end, some of his dynamic balancing techniques are still in use.

Technology changes, people and organizations not so much. Chapter XVII starting on page 189 describes a demonstration of two of his new high-speed steam engines (on operating at 150 RPM the other at 300 RPM) along with one of his colleague's new boilers at the 1870 Fair of the American Institute in New York. The boiler ran slanted water tubes through the firebox to more efficiently separate steam from the remaining water. The engines were small by 1870s standards, one having 16-inch diameter cylinders with a 30-inch stroke and the other having 6-inch diameter cylinders with a 12-inch stroke.

Other exhibitors also had boilers and steam engines, and yet other exhibitors had equipment driven by steam engines. All the boilers and steam engines where connected, but given that steam engines were, then as now, considered to be way cooler than mere boilers, it should not be too surprising that the boilers could not produce enough steam to keep all the engines running. In fact, by the end of the day, the steam pressure had dropped by half, resulting in great consternation and annoyance all around. The finger of suspicion quickly pointed at Porter's two high-speed steam engines—after all, great speed clearly must imply equally great consumption of steam, right?

Porter had anticipated this situation, and had therefore installed a shutoff valve that isolated the boiler and his two high-speed steam engines from the rest of the Fair's equipment. Porter therefore closed his valve, with the result that the steam pressure within his little steam network immediately rose to 70 PSI and the pressure to the rest of the network dropped to 25 PSI. In fact, the boiler generated excess steam even at 70 PSI, so that the fireman had to leave the firebox door slightly open to artificially lower the boiler temperature.

The steam pressure to the rest of the fair continued to decrease until it was but 15 PSI. Over the noon hour, an additional boiler was installed, which brought the pressure up to 70 PSI. Restarting the steam engines of course reduced the pressure, but at 5PM it was still 25 PSI.

The superintendent of the machinery department had repeatedly asked Porter to reopen the valve, but each time Porter had refused. At 5PM, the superintendent made it clear that his request was now a demand, and that if Porter would not open the valve, the superintendent would open it for him. Porter finally agreed to open the valve, but only on the condition that the other managers of the institute verify that the boiler was in fact generating more than enough steam for both engines. These managers were summoned forthwith, and they agreed that the boiler had been producing most of the show's steam and that the pair of high-speed steam engines had been consuming very little. Porter opened the valve, and there was no further trouble with low-pressure steam.

It is all too easy to imagine a roughly similar story unfolding in today's world. ;–)

Porter went on to develop steam engines capable of running well in excess of 1,000 RPM, with one key challenge being convincing onlookers that the motion-blurred engine really was running that fast.

Interestingly enough, steam engines were Porter's third career. He was a lawyer for several years, but became disgusted with legal practice. At about that same time, he became quite interested in the problem of facing stone, that is, producing a machine that would take a rough-cut stone and give it a smooth planar face (smooth by the standards of the mid-1800s, anyway). After a couple of years of experimentation, he produced a steam-powered machine that efficiently faced stone. Unfortunately, at about that same time, others realized that saws could even more efficiently face stone, so his invention was what we might now call a technical success and a business failure.

Oddly enough, we have recently learned that the application of saws to stone was not an invention of the mid-1800s, but rather a re-invention of a technique used heavily in the ancient Roman Empire, and suspected of having been used as early as the 13th century BC. This is one of many interesting nuggets on life in the Roman Empire brought out by the historical novel “Tears of Stone” by Vannoy and Zeiglar. This novel is informed by Zeigler's application of Cold War remote-sensing technology to interesting areas of the Italian landscape, a fact that I had the privilege of learning directly from Zeigler himself.

On the other hand, perhaps Porter's ghost can console himself with the fact that the earliest stone saws were hand-powered, and those of the Roman Empire were water powered. Porter's stone-facing machine was instead powered by modern steam engines. Yes, the ancient Egyptians also made some use of steam power, but as far as we know they never applied it industrially, and never via a reciprocating engine driving a rotary shaft. And yes, all of the qualifiers in the preceding sentence are necessary.

As we learn more about ancient civilizations, it will be interesting to see what other “modern inventions” turn out to have deep roots in ancient times!

December 23, 2017 10:29 PM

Paul E. Mc Kenney: Book review: "Make Trouble"

This book, by John Waters of “Hairspray” fame, was an impulse purchase. After all, who could fail to notice a small pink book with large white textured letters saying “Make Trouble”? It is a transcription of Waters's commencement address to the Rhode Institute School of Design's Class of 2015. Those who have known me over several decades might be surprised by this purchase, but what old man could resist a book whose flyleaf states “Anyone embarking on a creative path, he tells us, would do well to realize that pragmatism and discipline are as important as talent and that rejection is nothing to fear.”

They might be even more surprised that I agree with much of his advice. For but three examples:

  1. “A career in the arts is like a hitchhiking trip: All you need is one person to say ‘get in,’ and off you go.” Not really any different from my advising people to use the “high-school boy” algorithm when submitting papers and proposals.
  2. “Keep up with what's causing chaos in your field.” Not really any different from my “Go where there is trouble!”
  3. “Listen to your political enemies, particularly the smart ones”. Me, I would omit the word “political”, but close enough.
The book is mostly pictures, so if you are short of money, you do have the option of just reading it in the bookstore. See, I am making trouble already! ;–)

December 23, 2017 08:46 PM

December 21, 2017

Matthew Garrett: When should behaviour outside a community have consequences inside it?

Free software communities don't exist in a vacuum. They're made up of people who are also members of other communities, people who have other interests and engage in other activities. Sometimes these people engage in behaviour outside the community that may be perceived as negatively impacting communities that they're a part of, but most communities have no guidelines for determining whether behaviour outside the community should have any consequences within the community. This post isn't an attempt to provide those guidelines, but aims to provide some things that community leaders should think about when the issue is raised.

Some things to consider

Did the behaviour violate the law?

This seems like an obvious bar, but it turns out to be a pretty bad one. For a start, many things that are common accepted behaviour in various communities may be illegal (eg, reverse engineering work may contravene a strict reading of US copyright law), and taking this to an extreme would result in expelling anyone who's ever broken a speed limit. On the flipside, refusing to act unless someone broke the law is also a bad threshold - much behaviour that communities consider unacceptable may be entirely legal.

There's also the problem of determining whether a law was actually broken. The criminal justice system is (correctly) biased to an extent in favour of the defendant - removing someone's rights in society should require meeting a high burden of proof. However, this is not the threshold that most communities hold themselves to in determining whether to continue permitting an individual to associate with them. An incident that does not result in a finding of criminal guilt (either through an explicit finding or a failure to prosecute the case in the first place) should not be ignored by communities for that reason.

Did the behaviour violate your community norms?

There's plenty of behaviour that may be acceptable within other segments of society but unacceptable within your community (eg, lobbying for the use of proprietary software is considered entirely reasonable in most places, but rather less so at an FSF event). If someone can be trusted to segregate their behaviour appropriately then this may not be a problem, but that's probably not sufficient in all cases. For instance, if someone acts entirely reasonably within your community but engages in lengthy anti-semitic screeds on 4chan, it's legitimate to question whether permitting them to continue being part of your community serves your community's best interests.

Did the behaviour violate the norms of the community in which it occurred?

Of course, the converse is also true - there's behaviour that may be acceptable within your community but unacceptable in another community. It's easy to write off someone acting in a way that contravenes the standards of another community but wouldn't violate your expected behavioural standards - after all, if it wouldn't breach your standards, what grounds do you have for taking action?

But you need to consider that if someone consciously contravenes the behavioural standards of a community they've chosen to participate in, they may be willing to do the same in your community. If pushing boundaries is a frequent trait then it may not be too long until you discover that they're also pushing your boundaries.

Why do you care?

A community's code of conduct can be looked at in two ways - as a list of behaviours that will be punished if they occur, or as a list of behaviours that are unlikely to occur within that community. The former is probably the primary consideration when a community adopts a CoC, but the latter is how many people considering joining a community will think about it.

If your community includes individuals that are known to have engaged in behaviour that would violate your community standards, potential members or contributors may not trust that your CoC will function as adequate protection. A community that contains people known to have engaged in sexual harassment in other settings is unlikely to be seen as hugely welcoming, even if they haven't (as far as you know!) done so within your community. The way your members behave outside your community is going to be seen as saying something about your community, and that needs to be taken into account.

A second (and perhaps less obvious) aspect is that membership of some higher profile communities may be seen as lending general legitimacy to someone, and they may play off that to legitimise behaviour or views that would be seen as abhorrent by the community as a whole. If someone's anti-semitic views (for example) are seen as having more relevance because of their membership of your community, it's reasonable to think about whether keeping them in your community serves the best interests of your community.


I've said things like "considered" or "taken into account" a bunch here, and that's for a good reason - I don't know what the thresholds should be for any of these things, and there doesn't seem to be even a rough consensus in the wider community. We've seen cases in which communities have acted based on behaviour outside their community (eg, Debian removing Jacob Appelbaum after it was revealed that he'd sexually assaulted multiple people), but there's been no real effort to build a meaningful decision making framework around that.

As a result, communities struggle to make consistent decisions. It's unreasonable to expect individual communities to solve these problems on their own, but that doesn't mean we can ignore them. It's time to start coming up with a real set of best practices.

comment count unavailable comments

December 21, 2017 10:09 AM

December 14, 2017

Matthew Garrett: The Intel ME vulnerabilities are a big deal for some people, harmless for most

(Note: all discussion here is based on publicly disclosed information, and I am not speaking on behalf of my employers)

I wrote about the potential impact of the most recent Intel ME vulnerabilities a couple of weeks ago. The details of the vulnerability were released last week, and it's not absolutely the worst case scenario but it's still pretty bad. The short version is that one of the (signed) pieces of early bringup code for the ME reads an unsigned file from flash and parses it. Providing a malformed file could result in a buffer overflow, and a moderately complicated exploit chain could be built that allowed the ME's exploit mitigation features to be bypassed, resulting in arbitrary code execution on the ME.

Getting this file into flash in the first place is the difficult bit. The ME region shouldn't be writable at OS runtime, so the most practical way for an attacker to achieve this is to physically disassemble the machine and directly reprogram it. The AMT management interface may provide a vector for a remote attacker to achieve this - for this to be possible, AMT must be enabled and provisioned and the attacker must have valid credentials[1]. Most systems don't have provisioned AMT, so most users don't have to worry about this.

Overall, for most end users there's little to worry about here. But the story changes for corporate users or high value targets who rely on TPM-backed disk encryption. The way the TPM protects access to the disk encryption key is to insist that a series of "measurements" are correct before giving the OS access to the disk encryption key. The first of these measurements is obtained through the ME hashing the first chunk of the system firmware and passing that to the TPM, with the firmware then hashing each component in turn and storing those in the TPM as well. If someone compromises a later point of the chain then the previous step will generate a different measurement, preventing the TPM from releasing the secret.

However, if the first step in the chain can be compromised, all these guarantees vanish. And since the first step in the chain relies on the ME to be running uncompromised code, this vulnerability allows that to be circumvented. The attacker's malicious code can be used to pass the "good" hash to the TPM even if the rest of the firmware has been tampered with. This allows a sufficiently skilled attacker to extract the disk encryption key and read the contents of the disk[2].

In addition, TPMs can be used to perform something called "remote attestation". This allows the TPM to provide a signed copy of the recorded measurements to a remote service, allowing that service to make a policy decision around whether or not to grant access to a resource. Enterprises using remote attestation to verify that systems are appropriately patched (eg) before they allow them access to sensitive material can no longer depend on those results being accurate.

Things are even worse for people relying on Intel's Platform Trust Technology (PTT), which is an implementation of a TPM that runs on the ME itself. Since this vulnerability allows full access to the ME, an attacker can obtain all the private key material held in the PTT implementation and, effectively, adopt the machine's cryptographic identity. This allows them to impersonate the system with arbitrary measurements whenever they want to. This basically renders PTT worthless from an enterprise perspective - unless you've maintained physical control of a machine for its entire lifetime, you have no way of knowing whether it's had its private keys extracted and so you have no way of knowing whether the attestation attempt is coming from the machine or from an attacker pretending to be that machine.

Bootguard, the component of the ME that's responsible for measuring the firmware into the TPM, is also responsible for verifying that the firmware has an appropriate cryptographic signature. Since that can be bypassed, an attacker can reflash modified firmware that can do pretty much anything. Yes, that probably means you can use this vulnerability to install Coreboot on a system locked down using Bootguard.

(An aside: The Titan security chips used in Google Cloud Platform sit between the chipset and the flash and verify the flash before permitting anything to start reading from it. If an attacker tampers with the ME firmware, Titan should detect that and prevent the system from booting. However, I'm not involved in the Titan project and don't know exactly how this works, so don't take my word for this)

Intel have published an update that fixes the vulnerability, but it's pretty pointless - there's apparently no rollback protection in the affected 11.x MEs, so while the attacker is modifying your flash to insert the payload they can just downgrade your ME firmware to a vulnerable version. Version 12 will reportedly include optional rollback protection, which is little comfort to anyone who has current hardware. Basically, anyone whose threat model depends on the low-level security of their Intel system is probably going to have to buy new hardware.

This is a big deal for enterprises and any individuals who may be targeted by skilled attackers who have physical access to their hardware, and entirely irrelevant for almost anybody else. If you don't know that you should be worried, you shouldn't be.

[1] Although admins should bear in mind that any system that hasn't been patched against CVE-2017-5689 considers an empty authentication cookie to be a valid credential

[2] TPMs are not intended to be strongly tamper resistant, so an attacker could also just remove the TPM, decap it and (with some effort) extract the key that way. This is somewhat more time consuming than just reflashing the firmware, so the ME vulnerability still amounts to a change in attack practicality.

comment count unavailable comments

December 14, 2017 01:31 AM

December 12, 2017

Matthew Garrett: Eben Moglen is no longer a friend of the free software community

(Note: While the majority of the events described below occurred while I was a member of the board of directors of the Free Software Foundation, I am no longer. This is my personal position and should not be interpreted as the opinion of any other organisation or company I have been affiliated with in any way)

Eben Moglen has done an amazing amount of work for the free software community, serving on the board of the Free Software Foundation and acting as its general counsel for many years, leading the drafting of GPLv3 and giving many forceful speeches on the importance of free software. However, his recent behaviour demonstrates that he is no longer willing to work with other members of the community, and we should reciprocate that.

In early 2016, the FSF board became aware that Eben was briefing clients on an interpretation of the GPL that was incompatible with that held by the FSF. He later released this position publicly with little coordination with the FSF, which was used by Canonical to justify their shipping ZFS in a GPL-violating way. He had provided similar advice to Debian, who were confused about the apparent conflict between the FSF's position and Eben's.

This situation was obviously problematic - Eben is clearly free to provide whatever legal opinion he holds to his clients, but his very public association with the FSF caused many people to assume that these positions were held by the FSF and the FSF were forced into the position of publicly stating that they disagreed with legal positions held by their general counsel. Attempts to mediate this failed, and Eben refused to commit to working with the FSF on avoiding this sort of situation in future[1].

Around the same time, Eben made legal threats towards another project with ties to FSF. These threats were based on a license interpretation that ran contrary to how free software licenses had been interpreted by the community for decades, and was made without any prior discussion with the FSF (2017-12-11 update: page 126 of this document includes the email in which Eben asserts that the Software Freedom Conservancy is engaging in plagiarism by making use of appropriately credited material released under a Creative Commons license). This, in conjunction with his behaviour over the ZFS issue, led to him stepping down as the FSF's general counsel.

Throughout this period, Eben disparaged FSF staff and other free software community members in various semi-public settings. In doing so he harmed the credibility of many people who have devoted significant portions of their lives to aiding the free software community. At Libreplanet earlier this year he made direct threats against an attendee - this was reported as a violation of the conference's anti-harassment policy.

Eben has acted against the best interests of an organisation he publicly represented. He has threatened organisations and individuals who work to further free software. His actions are no longer to the benefit of the free software community and the free software community should cease associating with him.

[1] Contrary to the claim provided here, Bradley was not involved in this process.

(Edit to add: various people have asked for more details of some of the accusations here. Eben is influential in many areas, and publicising details without the direct consent of his victims may put them at professional risk. I'm aware that this reduces my credibility, and it's entirely reasonable for people to choose not to believe me as a result. I will add that I said much of this several months ago, so I'm not making stuff up in response to recent events)

comment count unavailable comments

December 12, 2017 05:59 AM

Linux Plumbers Conference: Linux Plumbers Conference 2018 site and dates

We are pleased to announce that the 2018 edition of the Linux Plumbers Conference will take place in Vancouver, British Columbia, Canada at the Sheraton Vancouver Wall Centre. It will be colocated with the Linux Kernel Summit. LPC will run from November 13, 2018 (Tuesday) to November 15, 2018 (Thursday).

We look forward to another great edition of LPC and to seeing you all in Vancouver!

Stay tuned for more information as the Linux Plumbers Conference committee starts planning for the 2018 conference.

The LPC Planning Committee.

December 12, 2017 12:59 AM

December 05, 2017

Pete Zaitcev: Marcan: Debugging an evil Go runtime bug

Fascinating, and a few reactions spring to mind.

First, I have to admit, the resolution simultaneously blew me away and was very nostalgic. Forgetting that some instructions are not atomic is just the thing that I saw people commit in architecture support in kernel (I don't remember if I ever used an opportunity to do it, it's quite possible, even on sun4c).

Also, my (former) colleague DaveJ (who's now consumed by Facebook -- I remember complaints about useful people "gone to Google and never heard from again", but Facebook is the same hole nowadays) once said, approximately: "Everyone loves to crap on Gentoo hackers for silly optimizations and being otherwise unprofessional, but when it's something interesting it's always (or often) them." Gentoo crew is underrated, including their userbase.

And finally:

Go also happens to have a (rather insane, in my opinion) policy of reinventing its own standard library, so it does not use any of the standard Linux glibc code to call vDSO, but rather rolls its own calls (and syscalls too).

Usually you hear about this when their DNS resolver blows up, but it can be elsewhere, as in this case.

(h/t to a chatter in #animeblogger)

UPDATE: CKS adds that some UNIXen officially require applications to use libc.

December 05, 2017 08:20 PM

November 28, 2017

Matthew Garrett: Potential impact of the Intel ME vulnerability

(Note: this is my personal opinion based on public knowledge around this issue. I have no knowledge of any non-public details of these vulnerabilities, and this should not be interpreted as the position or opinion of my employer)

Intel's Management Engine (ME) is a small coprocessor built into the majority of Intel CPU chipsets[0]. Older versions were based on the ARC architecture[1] running an embedded realtime operating system, but from version 11 onwards they've been small x86 cores running Minix. The precise capabilities of the ME have not been publicly disclosed, but it is at minimum capable of interacting with the network[2], display[3], USB, input devices and system flash. In other words, software running on the ME is capable of doing a lot, without requiring any OS permission in the process.

Back in May, Intel announced a vulnerability in the Advanced Management Technology (AMT) that runs on the ME. AMT offers functionality like providing a remote console to the system (so IT support can connect to your system and interact with it as if they were physically present), remote disk support (so IT support can reinstall your machine over the network) and various other bits of system management. The vulnerability meant that it was possible to log into systems with enabled AMT with an empty authentication token, making it possible to log in without knowing the configured password.

This vulnerability was less serious than it could have been for a couple of reasons - the first is that "consumer"[4] systems don't ship with AMT, and the second is that AMT is almost always disabled (Shodan found only a few thousand systems on the public internet with AMT enabled, out of many millions of laptops). I wrote more about it here at the time.

How does this compare to the newly announced vulnerabilities? Good question. Two of the announced vulnerabilities are in AMT. The previous AMT vulnerability allowed you to bypass authentication, but restricted you to doing what AMT was designed to let you do. While AMT gives an authenticated user a great deal of power, it's also designed with some degree of privacy protection in mind - for instance, when the remote console is enabled, an animated warning border is drawn on the user's screen to alert them.

This vulnerability is different in that it allows an authenticated attacker to execute arbitrary code within the AMT process. This means that the attacker shouldn't have any capabilities that AMT doesn't, but it's unclear where various aspects of the privacy protection are implemented - for instance, if the warning border is implemented in AMT rather than in hardware, an attacker could duplicate that functionality without drawing the warning. If the USB storage emulation for remote booting is implemented as a generic USB passthrough, the attacker could pretend to be an arbitrary USB device and potentially exploit the operating system through bugs in USB device drivers. Unfortunately we don't currently know.

Note that this exploit still requires two things - first, AMT has to be enabled, and second, the attacker has to be able to log into AMT. If the attacker has physical access to your system and you don't have a BIOS password set, they will be able to enable it - however, if AMT isn't enabled and the attacker isn't physically present, you're probably safe. But if AMT is enabled and you haven't patched the previous vulnerability, the attacker will be able to access AMT over the network without a password and then proceed with the exploit. This is bad, so you should probably (1) ensure that you've updated your BIOS and (2) ensure that AMT is disabled unless you have a really good reason to use it.

The AMT vulnerability applies to a wide range of versions, everything from version 6 (which shipped around 2008) and later. The other vulnerability that Intel describe is restricted to version 11 of the ME, which only applies to much more recent systems. This vulnerability allows an attacker to execute arbitrary code on the ME, which means they can do literally anything the ME is able to do. This probably also means that they are able to interfere with any other code running on the ME. While AMT has been the most frequently discussed part of this, various other Intel technologies are tied to ME functionality.

Intel's Platform Trust Technology (PTT) is a software implementation of a Trusted Platform Module (TPM) that runs on the ME. TPMs are intended to protect access to secrets and encryption keys and record the state of the system as it boots, making it possible to determine whether a system has had part of its boot process modified and denying access to the secrets as a result. The most common usage of TPMs is to protect disk encryption keys - Microsoft Bitlocker defaults to storing its encryption key in the TPM, automatically unlocking the drive if the boot process is unmodified. In addition, TPMs support something called Remote Attestation (I wrote about that here), which allows the TPM to provide a signed copy of information about what the system booted to a remote site. This can be used for various purposes, such as not allowing a compute node to join a cloud unless it's booted the correct version of the OS and is running the latest firmware version. Remote Attestation depends on the TPM having a unique cryptographic identity that is tied to the TPM and inaccessible to the OS.

PTT allows manufacturers to simply license some additional code from Intel and run it on the ME rather than having to pay for an additional chip on the system motherboard. This seems great, but if an attacker is able to run code on the ME then they potentially have the ability to tamper with PTT, which means they can obtain access to disk encryption secrets and circumvent Bitlocker. It also means that they can tamper with Remote Attestation, "attesting" that the system booted a set of software that it didn't or copying the keys to another system and allowing that to impersonate the first. This is, uh, bad.

Intel also recently announced Intel Online Connect, a mechanism for providing the functionality of security keys directly in the operating system. Components of this are run on the ME in order to avoid scenarios where a compromised OS could be used to steal the identity secrets - if the ME is compromised, this may make it possible for an attacker to obtain those secrets and duplicate the keys.

It's also not entirely clear how much of Intel's Secure Guard Extensions (SGX) functionality depends on the ME. The ME does appear to be required for SGX Remote Attestation (which allows an application using SGX to prove to a remote site that it's the SGX app rather than something pretending to be it), and again if those secrets can be extracted from a compromised ME it may be possible to compromise some of the security assumptions around SGX. Again, it's not clear how serious this is because it's not publicly documented.

Various other things also run on the ME, including stuff like video DRM (ensuring that high resolution video streams can't be intercepted by the OS). It may be possible to obtain encryption keys from a compromised ME that allow things like Netflix streams to be decoded and dumped. From a user privacy or security perspective, these things seem less serious.

The big problem at the moment is that we have no idea what the actual process of compromise is. Intel state that it requires local access, but don't describe what kind. Local access in this case could simply require the ability to send commands to the ME (possible on any system that has the ME drivers installed), could require direct hardware access to the exposed ME (which would require either kernel access or the ability to install a custom driver) or even the ability to modify system flash (possible only if the attacker has physical access and enough time and skill to take the system apart and modify the flash contents with an SPI programmer). The other thing we don't know is whether it's possible for an attacker to modify the system such that the ME is persistently compromised or whether it needs to be re-compromised every time the ME reboots. Note that even the latter is more serious than you might think - the ME may only be rebooted if the system loses power completely, so even a "temporary" compromise could affect a system for a long period of time.

It's also almost impossible to determine if a system is compromised. If the ME is compromised then it's probably possible for it to roll back any firmware updates but still report that it's been updated, giving admins a false sense of security. The only way to determine for sure would be to dump the system flash and compare it to a known good image. This is impractical to do at scale.

So, overall, given what we know right now it's hard to say how serious this is in terms of real world impact. It's unlikely that this is the kind of vulnerability that would be used to attack individual end users - anyone able to compromise a system like this could just backdoor your browser instead with much less effort, and that already gives them your banking details. The people who have the most to worry about here are potential targets of skilled attackers, which means activists, dissidents and companies with interesting personal or business data. It's hard to make strong recommendations about what to do here without more insight into what the vulnerability actually is, and we may not know that until this presentation next month.

Summary: Worst case here is terrible, but unlikely to be relevant to the vast majority of users.

[0] Earlier versions of the ME were built into the motherboard chipset, but as portions of that were incorporated onto the CPU package the ME followedEdit: Apparently I was wrong and it's still on the chipset
[1] A descendent of the SuperFX chip used in Super Nintendo cartridges such as Starfox, because why not
[2] Without any OS involvement for wired ethernet and for wireless networks in the system firmware, but requires OS support for wireless access once the OS drivers have loaded
[3] Assuming you're using integrated Intel graphics
[4] "Consumer" is a bit of a misnomer here - "enterprise" laptops like Thinkpads ship with AMT, but are often bought by consumers.

comment count unavailable comments

November 28, 2017 03:45 AM

November 26, 2017

Michael Kerrisk (manpages): Next Linux/UNIX System Programming course in Munich, 5-9 February, 2018

There are still some places free for my next 5-day Linux/UNIX System Programming course to take place in Munich, Germany, for the week of 5-9 February 2018.

The course is intended for programmers developing system-level, embedded, or network applications for Linux and UNIX systems, or programmers porting such applications from other operating systems (e.g., proprietary embedded/realtime operaring systems or Windows) to Linux or UNIX. The course is based on my book, The Linux Programming Interface (TLPI), and covers topics such as low-level file I/O; signals and timers; creating processes and executing programs; POSIX threads programming; interprocess communication (pipes, FIFOs, message queues, semaphores, shared memory), and network programming (sockets).
The course has a lecture+lab format, and devotes substantial time to working on some carefully chosen programming exercises that put the "theory" into practice. Students receive printed and electronic copies of TLPI, along with a 600-page course book that includes all slides presented in the course. A reading knowledge of C is assumed; no previous system programming experience is needed.

Some useful links for anyone interested in the course:

Questions about the course? Email me via

November 26, 2017 07:38 PM

Michael Kerrisk (manpages): man-pages-4.14 is released

I've released man-pages-4.14. The release tarball is available on The browsable online pages can be found on The Git repository for man-pages is available on

This release resulted from patches, bug reports, reviews, and comments from 71 contributors. Nearly 400 commits changed more than 160 pages. In addition, 4 new manual pages were added.

Among the more significant changes in man-pages-4.14 are the following:

November 26, 2017 07:14 PM

Michael Kerrisk (manpages): man-pages-4.13 is released

I've released man-pages-4.13. The release tarball is available on The browsable online pages can be found on The Git repository for man-pages is available on

This release resulted from patches, bug reports, reviews, and comments from around 40 contributors. The release is rather larger than average. (The context diff runs to more than 90k lines.) The release includes more than 350 commits and contains some fairly wide-ranging formatting fix-ups that meant that all 1028 existing manual pages saw some change(s). In addition, 5 new manual pages were added.

Among the more significant changes in man-pages-4.13 are the following:

A special thanks to Eugene Syromyatnikov, who contributed 30 patches to this release!

November 26, 2017 10:17 AM

November 25, 2017

Linux Plumbers Conference: Audio Recordings Posted

This year, by way of an experiment we tried recording the audio through the sound system of the talks track and one Microconference track (Those in Platinum C).  Unfortunately, because of technical problems, we have no recordings from Wednesday, but mostly complete ones from Thursday and Friday (Missing TPM Software Stack Status and Managing the Impact of Growing CPU Register State).

To find the audio, go to the full description of the talk or Microconference (click on the title) and scroll down to the bottom of the Abstract (just before the Tags section).  The audio is downloadable mp3, so you can either stream directly to your browser or download for later offline listening.

If you find the audio useful (or not), please let us know ( so we can plan for doing it again next year.

November 25, 2017 03:30 PM

November 24, 2017

Paul E. Mc Kenney: Parallel Programming: November 2017 Update

This USA Thanksgiving holiday weekend features a new release of Is Parallel Programming Hard, And, If So, What Can You Do About It?.

This update includes more formatting and build-system improvements, bibliography updates, and better handling of listings, all courtesy of Akira Yokosawa; numerous fixes and updates from Junchang Wang, Pierre Kuo, SeongJae Park, and Yubin Ruan; a new futures section on quantum computing; updates to the formal-verification section based on recent collaborations; and a full rewrite of the memory-barriers section, which is now its own chapter. This rewrite was of course based on recent work with my partners in memory-ordering crime, Jade Alglave, Luc Maranget, Andrea Parri, and Alan Stern.

As always, git:// will be updated in real time.

November 24, 2017 11:21 PM

November 20, 2017

Davidlohr Bueso: Linux v4.14: Performance Goodies

Last week Linus released the v4.14 kernel with some noticeable performance changes. The following is an unsorted and incomplete list of changes that went in. Note that the term 'performance' can be vague in that some gains in one area can negatively affect another, so take everything with a grain of salt and reach your own conclusions.

sysvipc: scale key management

We began using relativistic hash tables for managing ipc keys, which greatly improves the current O(N) lookups. As such, ipc_findkey() calls are significantly faster (+800% in some reaim file benchmarks) and we need not iterate all elements each time. Improvements are even seen in  scenarios where the amount of keys is but a handful, so this is pretty much a win from any standpoint.
[Commit 0cfb6aee70bd]

interval-tree: fast overlap detection

With the new extended rbtree api to cache the smallest (leftmost) node, instead of doing O(logN) walks to the end of the tree, we have the pointer always available. This allows to extend and complete the fast overlap detection for interval trees to speedup (sub)tree searches if the interval is completely to the left or right of the current tree's max interval. In addition, a number of other users that traverse rbtrees are updated to use the new rbtree_cached, such as epoll, procfs and cfq.
[Commit  cd9e61ed1eeb, 410bd5ecb276, 2554db916586, b2ac2ea6296af808c13fd373]

sched: waitqueue bookmarks

A situation where constant NUMA migrations of a hot-page triggered large number of page waiters being awoken exhibited some issues in the waitqueue implementation. In such cases, large number of wakeups will occur while holding a spinlock, which causes significant unbounded lantencies. Unlike wake_qs (used in futexes and locks), where batched wakeups are done without the lock, waitqueue bookmarks allow to to pause and stop iterating the wake list such that another process has a chance to acquire the lock. Then it can resume where it left off.
[Commit 3510ca20ece, 2554db916586, 11a19c7b099f]

 x86 PCID (Process Context Identifier)

This is a 64-bit hardware feature that allows tagging TLBs such that upon context switching, only flush the required entries. For virtualization (VT-x) this has supported similar features for a while, via vpid. On other archs it is called address space ID. Linux's support is somewhat special. In order to avoid the x86 limitations of 4096 IDs (or processes), the implementation actually uses a PCID to identify a recently-used mm (process address space) on a per-cpu basis. An mm has no fixed PCID binding at all; instead, it is given a fresh PCID each time it's loaded, except in cases where we want to preserve the TLB, in which case we reuse a recent value. To illustrate, a workload under kvm that ping pongs two processes, dTLB misses were reduced by ~17x.
[Commit f39681ed0f48, b0579ade7cd8, 94b1b03b519b, 43858b4f25cf, cba4671af7550790c9aad849, 660da7c9228f, 10af6235e0d3


ORC (Oops Rewind Capability) Unwinder

The much acclaimed replacement to frame pointers and the (out of tree) DWARF unwinder. Through simplicity, the end result is faster profiling, such as for perf. Experiments show a 20x performance increase using ORC vs DWARF while calling save_stack_trace 20,000 times via single vfs_write. With respect to frame pointers, the ORC unwinder is more accurate across interrupt entry frames and enables a 5-10% performance improvement across the entire kernel compared to frame pointers.
[Commit ee9f8fce9964, 39358a033b2e]

mm: choose swap device according to numa node

If the system has more than one swap device and swap device has the node information, we can make use of this information to decide which swap device to use in get_swap_pages() to get better performance. This change replaces a single global swap_avail list with a per-numa-node list: each numa node sees its own priority based list of available swap devices. Swap device's priority can be promoted on its matching node's swap_avail_list. Shows ~25% improvements for a 2 node box, benchmaring random writes on mmaped region withSSDs attached to each node, ensuring swapping in and out.
[Commit a2468cc9bfdf]

mm: reduce cost of page allocator

Upon page allocation, the per-zone statistics are updated, introducing overhead in the form of cacheline bouncing; responsible for ~30% of all CPU cycles  for allocating a single page. The networking folks have been known to complain about the performance degradation when dealing with the memory management subsystem, particularly the page allocator. The fact that these NUMA associated counters are rarely used allows the counter threshold that determines the frequency of updating the global counter with the percpu counters (hence cacheline bouncing) to be increased. This means hurting readers, but that's the point.
[Commit 3a321d2a3dde, 1d90ca897cb0, 638032224ed7]

archs: multibyte memset

New calls memset16(), memset32() and memset64() are introduced, which are like memset(), but allow the caller to fill the destination with a value larger than a single byte. There are a number of places in the kernel that can benefit from using an optimized function rather than a loop; sometimes text size, sometimes speed, and sometimes both. When supported by the architecture, use a single instruction, such as stosq (stores a quadword) in x86-64. Zram shows a 7% performance improvement on x86 with a 100Mb non-zero deduplicate data. If not available, default back to the slower loop implementation.
[Commits  3b3c4babd898, 03270c13c5ff, 4c51248533ad, 48ad1abef402]

powerpc: improve TLB flushing

A few optimisations were also added to the radix MMU TLB flushing, mostly to avoid unnecessary Page Walk Cache (PWC) flushes when the structure of the tree is not changing.
[Commit a46cc7a90fd8, 424de9c6e3f8]

There are plenty of other performance optimizations out there, including ext4 parallel file creation and quotas, additional memset improvements in sparc, transparent hugepage migrations and swap improvements, ipv6 (ip6_route_output()) optimizations, etc. Again, the list here is partial and biased by me. For more list of features play with 'git log' or visit lwn (part1, part2) and kernelnewbies.

November 20, 2017 03:50 PM

November 15, 2017

Kees Cook: security things in Linux v4.14

Previously: v4.13.

Linux kernel v4.14 was released this last Sunday, and there’s a bunch of security things I think are interesting:

vmapped kernel stack on arm64
Similar to the same feature on x86, Mark Rutland and Ard Biesheuvel implemented CONFIG_VMAP_STACK for arm64, which moves the kernel stack to an isolated and guard-paged vmap area. With traditional stacks, there were two major risks when exhausting the stack: overwriting the thread_info structure (which contained the addr_limit field which is checked during copy_to/from_user()), and overwriting neighboring stacks (or other things allocated next to the stack). While arm64 previously moved its thread_info off the stack to deal with the former issue, this vmap change adds the last bit of protection by nature of the vmap guard pages. If the kernel tries to write past the end of the stack, it will hit the guard page and fault. (Testing for this is now possible via LKDTM’s STACK_GUARD_PAGE_LEADING/TRAILING tests.)

One aspect of the guard page protection that will need further attention (on all architectures) is that if the stack grew because of a giant Variable Length Array on the stack (effectively an implicit alloca() call), it might be possible to jump over the guard page entirely (as seen in the userspace Stack Clash attacks). Thankfully the use of VLAs is rare in the kernel. In the future, hopefully we’ll see the addition of PaX/grsecurity’s STACKLEAK plugin which, in addition to its primary purpose of clearing the kernel stack on return to userspace, makes sure stack expansion cannot skip over guard pages. This “stack probing” ability will likely also become directly available from the compiler as well.

set_fs() balance checking
Related to the addr_limit field mentioned above, another class of bug is finding a way to force the kernel into accidentally leaving addr_limit open to kernel memory through an unbalanced call to set_fs(). In some areas of the kernel, in order to reuse userspace routines (usually VFS or compat related), code will do something like: set_fs(KERNEL_DS); ...some code here...; set_fs(USER_DS);. When the USER_DS call goes missing (usually due to a buggy error path or exception), subsequent system calls can suddenly start writing into kernel memory via copy_to_user (where the “to user” really means “within the addr_limit range”).

Thomas Garnier implemented USER_DS checking at syscall exit time for x86, arm, and arm64. This means that a broken set_fs() setting will not extend beyond the buggy syscall that fails to set it back to USER_DS. Additionally, as part of the discussion on the best way to deal with this feature, Christoph Hellwig and Al Viro (and others) have been making extensive changes to avoid the need for set_fs() being used at all, which should greatly reduce the number of places where it might be possible to introduce such a bug in the future.

SLUB freelist hardening
A common class of heap attacks is overwriting the freelist pointers stored inline in the unallocated SLUB cache objects. PaX/grsecurity developed an inexpensive defense that XORs the freelist pointer with a global random value (and the storage address). Daniel Micay improved on this by using a per-cache random value, and I refactored the code a bit more. The resulting feature, enabled with CONFIG_SLAB_FREELIST_HARDENED, makes freelist pointer overwrites very hard to exploit unless an attacker has found a way to expose both the random value and the pointer location. This should render blind heap overflow bugs much more difficult to exploit.

Additionally, Alexander Popov implemented a simple double-free defense, similar to the “fasttop” check in the GNU C library, which will catch sequential free()s of the same pointer. (And has already uncovered a bug.)

Future work would be to provide similar metadata protections to the SLAB allocator (though SLAB doesn’t store its freelist within the individual unused objects, so it has a different set of exposures compared to SLUB).

setuid-exec stack limitation
Continuing the various additional defenses to protect against future problems related to userspace memory layout manipulation (as shown most recently in the Stack Clash attacks), I implemented an 8MiB stack limit for privileged (i.e. setuid) execs, inspired by a similar protection in grsecurity, after reworking the secureexec handling by LSMs. This complements the unconditional limit to the size of exec arguments that landed in v4.13.

randstruct automatic struct selection
While the bulk of the port of the randstruct gcc plugin from grsecurity landed in v4.13, the last of the work needed to enable automatic struct selection landed in v4.14. This means that the coverage of randomized structures, via CONFIG_GCC_PLUGIN_RANDSTRUCT, now includes one of the major targets of exploits: function pointer structures. Without knowing the build-randomized location of a callback pointer an attacker needs to overwrite in a structure, exploits become much less reliable.

structleak passed-by-reference variable initialization
Ard Biesheuvel enhanced the structleak gcc plugin to initialize all variables on the stack that are passed by reference when built with CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL. Normally the compiler will yell if a variable is used before being initialized, but it silences this warning if the variable’s address is passed into a function call first, as it has no way to tell if the function did actually initialize the contents. So the plugin now zero-initializes such variables (if they hadn’t already been initialized) before the function call that takes their address. Enabling this feature has a small performance impact, but solves many stack content exposure flaws. (In fact at least one such flaw reported during the v4.15 development cycle was mitigated by this plugin.)

improved boot entropy
Laura Abbott and Daniel Micay improved early boot entropy available to the stack protector by both moving the stack protector setup later in the boot, and including the kernel command line in boot entropy collection (since with some devices it changes on each boot).

eBPF JIT for 32-bit ARM
The ARM BPF JIT had been around a while, but it didn’t support eBPF (and, as a result, did not provide constant value blinding, which meant it was exposed to being used by an attacker to build arbitrary machine code with BPF constant values). Shubham Bansal spent a bunch of time building a full eBPF JIT for 32-bit ARM which both speeds up eBPF and brings it up to date on JIT exploit defenses in the kernel.

seccomp improvements
Tyler Hicks addressed a long-standing deficiency in how seccomp could log action results. In addition to creating a way to mark a specific seccomp filter as needing to be logged with SECCOMP_FILTER_FLAG_LOG, he added a new action result, SECCOMP_RET_LOG. With these changes in place, it should be much easier for developers to inspect the results of seccomp filters, and for process launchers to generate logs for their child processes operating under a seccomp filter.

Additionally, I finally found a way to implement an often-requested feature for seccomp, which was to kill an entire process instead of just the offending thread. This was done by creating the SECCOMP_RET_ACTION_FULL mask (née SECCOMP_RET_ACTION) and implementing SECCOMP_RET_KILL_PROCESS.

That’s it for now; please let me know if I missed anything. The v4.15 merge window is now open!

© 2017, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
Creative Commons License

November 15, 2017 05:23 AM

November 14, 2017

James Morris: Save the Dates: Linux Security Summit Events for 2018

There will be a new European version of the Linux Security Summit for 2018, in addition to the established North American event.

The dates and locations are as follows:

Stay tuned for CFP announcements!


November 14, 2017 11:24 PM

November 13, 2017

Gustavo F. Padovan: The linuxdev-br conference was a success!

Last Saturday we had the first edition of the Linux Developer Conference Brazil. A conference  born from the need of a meeting point, in Brazil, for the developers,  enthusiasts and companies of FOSS projects that forms the Core of modern Linux systems, either it be in smartphones, cloud, cars or TVs.

After a few years traveling to conferences around the world I felt that we didn’t have in Brazil any forum like the ones outside of Brazil, so I came up with the idea of building one myself. So I invited two friends of mine to take on the challenge, Bruno Dilly and João Moreira. We also got help from University of Campinas that allowed us to use their space, many thanks to Professor Islene Garcia.

Together we made linuxdev-br was a success, the talks were great. Almost 100 people attended the conference, some of them traveling from quite far places in Brazil. During the day we had João Avelino Bellomo Filho talking about SystemTap, Lucas Villa Real talking about Virtualization with GoboLinux’ Runner and Felipe Neves talking about the Zephyr project. In the afternoon we had Fabio Estevam talking about Device Tree, Arnaldo Melo on perf tools and João Moreira on Live Patching. All videos are available here (in Portuguese).

To finish the day we had a Happy Hour paid by the sponsors of the conference. It was a great opportunity to have some beers and interact with other attendees.

I want to thank you everyone that joined us in the first edition, next year it will be even better. By the way, talking about next year, the conference idiom next year will be English. We want linuxdev-br to become part of the international cycle of conferences! Stay tuned for next year, if you want to take part, talk or sponsor please reach us at

November 13, 2017 03:33 PM

November 07, 2017

Dave Airlie (blogspot): radv on Ubuntu broken in distro packages

It appears that Ubuntu mesa 17.2.2 packages that ship radv, have patches to enable MIR support. These patches actually just break radv instead. I'd seen some people complain that simple apps don't work on radv, and saying radv wasn't ready for use and how could anyone thing of using it and just wondered what they had been smoking as Fedora was working fine. Hopefully Canonical can sort that out ASAP.

November 07, 2017 07:35 PM

Pete Zaitcev: ProxyFS opened, I think

Not exactly sure if that thing is complete, and I didn't attend the announcement (at OpenStack Summit in Sydney, presumably), but it appears that SwiftStack open-sourced ProxyFS. The project was announced to the world a year an a half ago.

UPDATE: The Swiftstack product is called "File Access", but AFAIK the project is still "ProxyFS".

November 07, 2017 02:25 AM

October 26, 2017

Pete Zaitcev: Polite like Sphinx

Exception occurred:
   File "/usr/lib/python2.7/site-packages/sphinx/util/", line 363, in filter
     raise SphinxWarning(message % record.args)
TypeError: not all arguments converted during string formatting
The full traceback has been saved in /tmp/sphinx-err-SD2Ra4.log, if you want to report the issue to the developers.

Love how modest this package is.

October 26, 2017 08:26 PM

October 21, 2017

Pavel Machek: Prague and Nokia N900s

If you are travelling to Prague to ELCE, and have Nokia N900, N9 or N950, or spare parts for them, please take them with you. I may help you install postmarket os there (, can probably charge N900 that does not charge, and spare parts would be useful for me. I have a talk about cameras, and will be around... .

October 21, 2017 10:25 PM

October 20, 2017

James Morris: Security Session at the 2017 Kernel Summit

For folks attending Open Source Summit Europe next week in Prague, note that there is a security session planned as part of the co-located Kernel Summit technical track.

This year, the Kernel Summit is divided into two components:

  1. An invitation-only maintainer summit of 30 people total, and;
  2. An open kernel summit technical track which is open to all attendees of OSS Europe.

The security session is part of the latter.  The preliminary agenda for the kernel summit technical track was announced by Ted Ts’o here:

There is also a preliminary agenda for the security session, here:

Currently, the agenda includes an update from Kees Cook on the Kernel Self Protection Project, and an update from Jarkko Sakkinen on TPM support.  I’ll provide a summary of the recent Linux Security Summit, depending on available time, perhaps focusing on security namespacing issues.

This agenda is subject to change and if you have any topics to propose, please send an email to the ksummit-discuss list.


October 20, 2017 12:23 AM

October 16, 2017

Greg Kroah-Hartman: Linux Kernel Community Enforcement Statement FAQ

Based on the recent Linux Kernel Community Enforcement Statement and the article describing the background and what it means , here are some Questions/Answers to help clear things up. These are based on questions that came up when the statement was discussed among the initial round of over 200 different kernel developers.

Q: Is this changing the license of the kernel?

A: No.

Q: Seriously? It really looks like a change to the license.

A: No, the license of the kernel is still GPLv2, as before. The kernel developers are providing certain additional promises that they encourage users and adopters to rely on. And by having a specific acking process it is clear that those who ack are making commitments personally (and perhaps, if authorized, on behalf of the companies that employ them). There is nothing that says those commitments are somehow binding on anyone else. This is exactly what we have done in the past when some but not all kernel developers signed off on the driver statement.

Q: Ok, but why have this “additional permissions” document?

A: In order to help address problems caused by current and potential future copyright “trolls” aka monetizers.

Q: Ok, but how will this help address the “troll” problem?

A: “Copyright trolls” use the GPL-2.0’s immediate termination and the threat of an immediate injunction to turn an alleged compliance concern into a contract claim that gives the troll an automatic claim for money damages. The article by Heather Meeker describes this quite well, please refer to that for more details. If even a short delay is inserted for coming into compliance, that delay disrupts this expedited legal process.

By simply saying, “We think you should have 30 days to come into compliance”, we undermine that “immediacy” which supports the request to the court for an immediate injunction. The threat of an immediate junction was used to get the companies to sign contracts. Then the troll goes back after the same company for another known violation shortly after and claims they’re owed the financial penalty for breaking the contract. Signing contracts to pay damages to financially enrich one individual is completely at odds with our community’s enforcement goals.

We are showing that the community is not out for financial gain when it comes to license issues – though we do care about the company coming into compliance.  All we want is the modifications to our code to be released back to the public, and for the developers who created that code to become part of our community so that we can continue to create the best software that works well for everyone.

This is all still entirely focused on bringing the users into compliance. The 30 days can be used productively to determine exactly what is wrong, and how to resolve it.

Q: Ok, but why are we referencing GPL-3.0?

A: By using the terms from the GPLv3 for this, we use a very well-vetted and understood procedure for granting the opportunity to come fix the failure and come into compliance. We benefit from many months of work to reach agreement on a termination provision that worked in legal systems all around the world and was entirely consistent with Free Software principles.

Q: But what is the point of the “non-defensive assertion of rights” disclaimer?

A: If a copyright holder is attacked, we don’t want or need to require that copyright holder to give the party suing them an opportunity to cure. The “non-defensive assertion of rights” is just a way to leave everything unchanged for a copyright holder that gets sued.  This is no different a position than what they had before this statement.

Q: So you are ok with using Linux as a defensive copyright method?

A: There is a current copyright troll problem that is undermining confidence in our community – where a “bad actor” is attacking companies in a way to achieve personal gain. We are addressing that issue. No one has asked us to make changes to address other litigation.

Q: Ok, this document sounds like it was written by a bunch of big companies, who is behind the drafting of it and how did it all happen?

A: Grant Likely, the chairman at the time of the Linux Foundation’s Technical Advisory Board (TAB), wrote the first draft of this document when the first copyright troll issue happened a few years ago. He did this as numerous companies and developers approached the TAB asking that the Linux kernel community do something about this new attack on our community. He showed the document to a lot of kernel developers and a few company representatives in order to get feedback on how it should be worded. After the troll seemed to go away, this work got put on the back-burner. When the copyright troll showed back up, along with a few other “copycat” like individuals, the work on the document was started back up by Chris Mason, the current chairman of the TAB. He worked with the TAB members, other kernel developers, lawyers who have been trying to defend these claims in Germany, and the TAB members’ Linux Foundation’s lawyers, in order to rework the document so that it would actually achieve the intended benefits and be useful in stopping these new attacks. The document was then reviewed and revised with input from Linus Torvalds and finally a document that the TAB agreed would be sufficient was finished. That document was then sent to over 200 of the most active kernel developers for the past year by Greg Kroah-Hartman to see if they, or their company, wished to support the document. That produced the initial “signatures” on the document, and the acks of the patch that added it to the Linux kernel source tree.

Q: How do I add my name to the document?

A: If you are a developer of the Linux kernel, simply send Greg a patch adding your name to the proper location in the document (sorting the names by last name), and he will be glad to accept it.

Q: How can my company show its support of this document?

A: If you are a developer working for a company that wishes to show that they also agree with this document, have the developer put the company name in ‘(’ ‘)’ after the developer’s name. This shows that both the developer, and the company behind the developer are in agreement with this statement.

Q: How can a company or individual that is not part of the Linux kernel community show its support of the document?

A: Become part of our community! Send us patches, surely there is something that you want to see changed in the kernel. If not, wonderful, post something on your company web site, or personal blog in support of this statement, we don’t mind that at all.

Q: I’ve been approached by a copyright troll for Netfilter. What should I do?

A: Please see the Netfilter FAQ here for how to handle this

Q: I have another question, how do I ask it?

A: Email Greg or the TAB, and they will be glad to help answer them.

October 16, 2017 09:05 AM

Greg Kroah-Hartman: Linux Kernel Community Enforcement Statement

By Greg Kroah-Hartman, Chris Mason, Rik van Riel, Shuah Khan, and Grant Likely

The Linux kernel ecosystem of developers, companies and users has been wildly successful by any measure over the last couple decades. Even today, 26 years after the initial creation of the Linux kernel, the kernel developer community continues to grow, with more than 500 different companies and over 4,000 different developers getting changes merged into the tree during the past year. As Greg always says every year, the kernel continues to change faster this year than the last, this year we were running around 8.5 changes an hour, with 10,000 lines of code added, 2,000 modified, and 2,500 lines removed every hour of every day.

The stunning growth and widespread adoption of Linux, however, also requires ever evolving methods of achieving compliance with the terms of our community’s chosen license, the GPL-2.0. At this point, there is no lack of clarity on the base compliance expectations of our community. Our goals as an ecosystem are to make sure new participants are made aware of those expectations and the materials available to assist them, and to help them grow into our community.  Some of us spend a lot of time traveling to different companies all around the world doing this, and lots of other people and groups have been working tirelessly to create practical guides for everyone to learn how to use Linux in a way that is compliant with the license. Some of these activities include:

Unfortunately the same processes that we use to assure fulfillment of license obligations and availability of source code can also be used unjustly in trolling activities to extract personal monetary rewards. In particular, issues have arisen as a developer from the Netfilter community, Patrick McHardy, has sought to enforce his copyright claims in secret and for large sums of money by threatening or engaging in litigation. Some of his compliance claims are issues that should and could easily be resolved. However, he has also made claims based on ambiguities in the GPL-2.0 that no one in our community has ever considered part of compliance.  

Examples of these claims have been distributing over-the-air firmware, requiring a cell phone maker to deliver a paper copy of source code offer letter; claiming the source code server must be setup with a download speed as fast as the binary server based on the “equivalent access” language of Section 3; requiring the GPL-2.0 to be delivered in a local language; and many others.

How he goes about this activity was recently documented very well by Heather Meeker.

Numerous active contributors to the kernel community have tried to reach out to Patrick to have a discussion about his activities, to no response. Further, the Netfilter community suspended Patrick from contributing for violations of their principles of enforcement. The Netfilter community also published their own FAQ on this matter.

While the kernel community has always supported enforcement efforts to bring companies into compliance, we have never even considered enforcement for the purpose of extracting monetary gain.  It is not possible to know an exact figure due to the secrecy of Patrick’s actions, but we are aware of activity that has resulted in payments of at least a few million Euros.  We are also aware that these actions, which have continued for at least four years, have threatened the confidence in our ecosystem.

Because of this, and to help clarify what the majority of Linux kernel community members feel is the correct way to enforce our license, the Technical Advisory Board of the Linux Foundation has worked together with lawyers in our community, individual developers, and many companies that participate in the development of, and rely on Linux, to draft a Kernel Enforcement Statement to help address both this specific issue we are facing today, and to help prevent any future issues like this from happening again.

A key goal of all enforcement of the GPL-2.0 license has and continues to be bringing companies into compliance with the terms of the license. The Kernel Enforcement Statement is designed to do just that.  It adopts the same termination provisions we are all familiar with from GPL-3.0 as an Additional Permission giving companies confidence that they will have time to come into compliance if a failure is identified. Their ability to rely on this Additional Permission will hopefully re-establish user confidence and help direct enforcement activity back to the original purpose we have all sought over the years – actual compliance.  

Kernel developers in our ecosystem may put their own acknowledgement to the Statement by sending a patch to Greg adding their name to the Statement, like any other kernel patch submission, and it will be gladly merged. Those authorized to ‘ack’ on behalf of their company may add their company name in (parenthesis) after their name as well.

Note, a number of questions did come up when this was discussed with the kernel developer community. Please see Greg’s FAQ post answering the most common ones if you have further questions about this topic.

October 16, 2017 09:00 AM

Pavel Machek: Help time travelers!

Ok, so I have various machines here. It seems only about half of them has working RTC. That are the boring ones.

And even the boring ones have pretty imprecise RTCs... For example Nokia N9. I only power it up from time to time, I believe it drifts something like minute per month... For normal use with SIM card, it can probably correct from GSM network if you happen to have a cell phone signal, but...

More interesting machines... Old thinkpad is running without CMOS battery. ARM OLPC has _three_ RTCs, but not a single working one. N900 has working RTC but no or dead backup battery. On these, RTC driver probably knows time is not valid, but feeds the garbage into the system time, anyway. Ouch. Neither Sharp Zaurus SL-5500 nor C-3000 had battery backup on RTC...

Even in new end-user machines, time quality varies a lot. "First boot, please enter time" is only accurate to seconds, if the user is careful. RTC is usually not very accurate, either... and noone uses adjtime these days. GSM time and ntpdate are probably accurate to miliseconds, GPS can provide time down to picoseconds... And broken systems are so common "swclock" is available in init system to store time in file, so it at least does not go backwards.

https (and other crypto) depends on time... so it is important to know approximate month we are in.

Is it time we handle it better?

Could we return both time and log2(expected error) from system calls?

That way we could hide the clock in GUI if time is not available or not precise to minutes, ignore certificate dates when time is not precise to months, and you would not have to send me a "Pavel, are you time traveling, again?" message next time my mailer sends email dated to 1970.

October 16, 2017 07:38 AM

October 14, 2017

James Bottomley: Using Elliptic Curve Cryptography with TPM2

One of the most significant advances going from TPM1.2 to TPM2 was the addition of algorithm agility: The ability of TPM2 to work with arbitrary symmetric and asymmetric encryption schemes.  In practice, in spite of this much vaunted agile encryption capability, most actual TPM2 chips I’ve seen only support a small number of asymmetric encryption schemes, usually RSA2048 and a couple of Elliptic Curves.  However, the ability to support any Elliptic Curve at all is a step up from TPM1.2.  This blog post will detail how elliptic curve schemes can be integrated into existing cryptographic systems using TPM2.  However, before we start on the practice, we need at least a tiny swing through the theory of Elliptic Curves.

What is an Elliptic Curve?

An Elliptic Curve (EC) is simply the set of points that lie on the curve in the two dimensional plane (x,y) defined by the equation

y2 = x3 + ax + b

which means that every elliptic curve can be parametrised by two constants a and b.  The set of all points lying on the curve plus a point at infinity is combined with an addition operation to produce an abelian (commutative) group.  The addition property is defined by drawing straight lines between two points and seeing where they intersect the curve (or picking the infinity point if they don’t intersect).  Wikipedia has a nice diagrammatic description of this here.  The infinity point acts as the identity of the addition rule and the whole group is denoted E.

The utility for cryptography is that you can define an integer multiplier operation which is simply the element added to itself n times, so for P ∈ E, you can always find Q ∈ E such that

Q = P + P + P … = n × P

And, since it’s a simple multiplication like operation, it’s very easy to compute Q.  However, given P and Q it is computationally very difficult to get back to n.  In fact, it can be demonstrated mathematically that trying to compute n is equivalent to the discrete logarithm problem which is the mathematical basis for the cryptographic security of RSA.  This also means that EC keys suffer the same (actually more so) problems as RSA keys: they’re not Quantum Computing secure (vulnerable to the Quantum Shor’s algorithm) and they would be instantly compromised if the discrete logarithm problem were ever solved.

Therefore, for any elliptic curve, E, you can choose a known point G ∈ E, select a large integer d and you can compute a point P = d × G.  You can then publish (P, G, E) as your public key knowing it’s computationally infeasible for anyone to derive your private key d.

For instance, Diffie-Hellman key exchange can be done by agreeing (E, G) and getting Alice and Bob to select private keys dA, dB.  Then knowing Bob’s public key PB, Alice can select a random integer r, which she publishes, and compute a key agreement as a secret point on the Elliptic Curve (r dA) × PB.  Bob can derive the same Elliptic Curve point because

(r dA) × PB = (r dA)dB × G = (r dB) dA × G = (r dB) × PA

The agreement is a point on the curve, but you can use an agreed hashing or other mechanism to get from the point to a symmetric key.

Seems simple, but the problem for computing is that we really want to use integers and right at the moment the elliptic curve is defined over all the real numbers, meaning E is of infinite size and involves floating point computations (of rather large precision).

Elliptic Curves over Finite Fields

Fortunately there is a mathematical theory of finite fields, called Galois Theory, which allows us to take the Galois Field over prime number p, which is denoted GF(p), and compute Elliptic Curve points over this field.  This derivation, which is mathematically rather complicated, is denoted E(GF(p)), where every point (x,y) is represented by a pair of integers between 0 and p-1.  There is another theory that says the number of elements in E(GF(p))

n = |E(GF(p))|

is roughly the same size as p, meaning if you choose a 32 bit prime p, you likely have a field over roughly 2^32 elements.  For every point P in E(GF(p)) it is also mathematically proveable that n × P = 0. where 0 is the zero point (which was the infinity point in the real elliptic curve).

This means that you can take any point, G,  in E(GF(p)) and compute a subgroup based on it:

EG = { ∀m ∈ Zn : m × G }

If you’re lucky |EG| = |E(GF(p))| and G is the generator of the entire group.  However, G may only generate a subgroup and you will find |EG| = h|E(GF(p))| where integer h is called the cofactor.  In general you want the cofactor to be small (preferably less than four) for EG to be cryptographically useful.

For a computer’s purposes, EG is the elliptic curve group used for integer arithmetic in the cryptographic algorithms.  The Curve and Generator is then defined by (p, a, b, Gx, Gy, n, h) which are the published parameters of the key (Gx, Gy represent the x and y elements of point G).  You select a random number d as your private key and your public key P = d × G exactly as above, except now P is easy to compute with integer operations.

Problems with Elliptic Curves

Although I stated above that solving P = d × G is equivalent in difficulty to the discrete logarithm problem, that’s not generally true.  If the discrete logarithm problem were solved, then we’d easily be able to compute d for every generator and curve, but it is possible to pick curves for which d can be easily computed without solving the discrete logarithm problem. This is the reason why you should never pick your own curve parameters (even if you think you know what you’re doing) because it’s very easy to choose a compromised curve.  As a demonstration of the difficulty of the problem: each of the major nation state actors, Russia, China and the US, publishes their own curve parameters for use in their own cryptographic EC implementations and each of them thinks the parameters published by the others is compromised in a way that allows the respective national security agencies to derive private keys.  So if nation state actors can’t tell if a curve is compromised or not, you surely won’t be able to either.

Therefore, to be secure in EC cryptography, you pick and existing curve which has been vetted and select some random Generator Point on it.  Of course, if you’re paranoid, that means you won’t be using any of the nation state supplied curves …

Using the TPM2 with Elliptic Curves in Cryptosystems

The initial target for this work was the openssl cryptosystem whose libraries are widely used for deriving other uses (like https in apache or openssh). Originally, when I did the initial TPM2 enabling of openssl as described in this blog post, I added TPM2 as a patch to the existing TPM 1.2 openssl_tpm_engine.  Unfortunately, openssl_tpm_engine seems to be pretty much defunct at this point, so I started my own openssl_tpm2_engine as a separate git tree to begin experimenting with Elliptic Curve keys (if you don’t use git, you can download the tar file here). One of the benefits to running my own source tree is that I can now add a testing infrastructure that makes use of the IBM TPM emulator to make sure that the basic cryptographic operations all work which means that make check functions even when a TPM2 isn’t available.  The current key creation and import algorithms use secured connections to the TPM (to avoid eavesdropping) which means it’s only really possible to construct them using the IBM TSS. To make all of this easier, I’ve set up an openSUSE Build Service repository which is building for all major architectures and the openSUSE and Fedora distributions (ignore the failures, they’re currently induced because the TPM emulator only currently works on 64 bit little endian systems, so make check is failing, but the TPM people at IBM are working on this, so eventually the builds should be complete).

TPM2 itself also has some annoying restrictions.  The biggest of which is that it doesn’t allow you to pass in arbitrary elliptic curve parameters; you may only use elliptic curves which the TPM itself knows.  This will be annoying if you have an existing EC key you’re trying to import because the TPM may reject it as an unknown algorithm.  For instance, openssl can actually compute with arbitrary EC parameters, but has 39 current elliptic curves parametrised by name. By contrast, my Nuvoton TPM2 inside my Dell XPS 13 knows precisely two curves:

jejb@jarvis:~> create_tpm2_key --list-curves

However, assuming you’ve picked a compatible curve for your EC private key (and you’ve defined a parent key for the storage hierarchy) you can simply import it to a TPM bound key:

create_tpm2_key -p 81000001 -w key.priv key.tpm

The tool will report an error if it can’t convert the curve parameters to a named elliptic curve known to the TPM

jejb@jarvis:~> openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:brainpoolP256r1 > key.priv
jejb@jarvis:~> create_tpm2_key -p 81000001 -w key.priv key.tpm
TPM does not support the curve in this EC key
openssl_to_tpm_public failed with 166
TPM_RC_CURVE - curve not supported Handle number unspecified

You can also create TPM resident private keys simply by specifying the algorithm

create_tpm2_key -p 81000001 --ecc bnp256 key.tpm

Once you have your TPM based EC keys, you can use them to create public keys and certificates.  For instance, you create a self-signed X509 certificate based on the tpm key by

openssl req -new -x509 -sha256  -key key.tpm -engine tpm2 -keyform engine -out my.crt

Why you should use EC keys with the TPM

The initial attraction is the same as for RSA keys: making it impossible to extract your private key from the system.  However, the mathematical calculations for EC keys are much simpler than for RSA keys and don’t involve finding strong primes, so it’s much simpler for the TPM (being a fairly weak calculation machine) to derive private and public EC keys.  For instance the times taken to derive a RSA key from the primary seed and an EC key differ dramatically

jejb@jarvis:~> time tsscreateprimary -hi o -ecc bnp256 -st
Handle 80ffffff

real 0m0.111s
user 0m0.000s
sys 0m0.014s

jejb@jarvis:~> time tsscreateprimary -hi o -rsa -st
Handle 80ffffff

real 0m20.473s
user 0m0.015s
sys 0m0.084s

so for a slow system like the TPM, using EC keys is a significant speed advantage.  Additionally, there are other advantages.  The standard EC Key signature algorithm is a modification of the NIST Digital Signature Algorithm called ECDSA.  However DSA and ECDSA require a cryptographically strong (and secret) random number as Sony found out to their cost in the EC Key compromise of Playstation 3.  The TPM is a good source of cryptographically strong random numbers and if it generates the signature internally, you can be absolutely sure of keeping the input random number secret.

Why you might want to avoid EC keys altogether

In spite of the many advantages described above, EC keys suffer one additional disadvantage over RSA keys in that Elliptic Curves in general are very hot fields of mathematical research so even if the curve you use today is genuinely not compromised, it’s not impossible that a mathematical advance tomorrow will make the curve you chose (and thus all the private keys you generated) vulnerable.  Of course, the same goes for RSA if anyone ever cracks the discrete logarithm problem, but solving that problem would likely be fully published to world acclaim and recognition as a significant contribution to the advancement of number theory.  Discovering an attack on a currently used elliptic curve on the other hand might be better remunerated by offering to sell it privately to one of the national security agencies …

October 14, 2017 10:52 PM

October 11, 2017

Paul E. Mc Kenney: Stupid RCU Tricks: In the audience for a pair of RCU talks!

I had the privilege of attending CPPCON last month. Michael Wong, Maged Michael, and I presented a parallel-programming overview, in which I presented the "Hardware and its Habits" chapter of Is Parallel Programming Hard, And, If So, What Can You Do About It?.

But the highlight for me was actually sitting in the audience for a pair of talks by people who had implemented RCU in C++.

Ansel Sermersheim presented a two-part talk entitled Multithreading is the answer. What is the question?. The second part of this talk covered lockless containers, and used a variant of RCU to implement a low-overhead libGuarded facility in order to more easily avoid deadlocks. The implementation is similar to the Linux-kernel real-time RCU implementation by Jim Houston and Joe Korty in that the counterpart to rcu_read_unlock() actively registers a quiescent state. Ansel's implementation goes further by also driving callback invocation from rcu_read_unlock(). Now I don't recommend this for a general-purpose RCU implementation due to the possibility of deadlock should a resource need to be held across rcu_read_unlock() and acquired within the callback. However, this approach should work just fine in the case where the callbacks just free memory and the memory allocator does not contain too many RCU read-side critical sections.

Fedor Pikus presented a talk entitled Read, Copy, Update, then what? RCU for non-kernel programmers, in which he gave a quite-decent introduction to use of RCU. This introduction included an improved version of my long-standing where-to-use-RCU diagram, which I fully intend to incorporate. I had a number of but-you-could moments, including the usual "put the size in with the array" advice, ways of updating things already exposed to readers, and the fact that RCU really can tolerate multiple writers, along with some concerns about counter overflow. Nevertheless, an impressive amount of great information in a one-hour talk!

It is very good to see more people making use of RCU!

October 11, 2017 09:47 PM

October 04, 2017

Dave Airlie (blogspot): radv: a conformant Vulkan driver (with caveats)

If you take a look at the conformant vulkan list, you might see entry 220.

Software in the Public Interest, Inc. 2017-10-04 Vulkan_1_0 220
AMD Radeon R9 285 Intel i5-4460 x86_64 Linux 4.13 DRI3.

 This is radv, and this is the first conformance submission done under the (SPI) membership of the Khronos adopter program.

This submission was a bit of a trial run for radv developers, but Mesa 17.2 + llvm 5.0 on Bas's R9 285 card.

We can extend this submission to cover all VI GPUs.

In practice we pass all the same tests on CIK and Polaris GPUs, but we will have to do complete submission runs on those when we get a chance.

But major milestone/rubberstamp reached, radv is now a conformant Vulkan driver. Thanks go to Bas and all the other contributors and people who's code we've leveraged!

October 04, 2017 07:49 PM

October 02, 2017

James Morris: Linux Security Summit 2017 Roundup

The 2017 Linux Security Summit (LSS) was held last month in Los Angeles over the 14th and 15th of September.  It was co-located with Open Source Summit North America (OSSNA) and the Linux Plumbers Conference (LPC).

LSS 2017 sign at conference

LSS 2017

Once again we were fortunate to have general logistics managed by the Linux Foundation, allowing the program committee to focus on organizing technical content.  We had a record number of submissions this year and accepted approximately one third of them.  Attendance was very strong, with ~160 attendees — another record for the event.

LSS 2017 Attendees

LSS 2017 Attendees

On the day prior to LSS, attendees were able to access a day of LPC, which featured two tracks with a security focus:

Many thanks to the LPC organizers for arranging the schedule this way and allowing LSS folk to attend the day!

Realtime notes were made of these microconfs via etherpad:

I was particularly interested in the topic of better integrating LSM with containers, as there is an increasingly common requirement for nesting of security policies, where each container may run its own apparently independent security policy, and also a potentially independent security model.  I proposed the approach of introducing a security namespace, where all security interfaces within the kernel are namespaced, including LSM.  It would potentially solve the container use-cases, and also the full LSM stacking case championed by Casey Schaufler (which would allow entirely arbitrary stacking of security modules).

This would be a very challenging project, to say the least, and one which is further complicated by containers not being a first class citizen of the kernel.   This leads to security policy boundaries clashing with semantic functional boundaries e.g. what does it mean from a security policy POV when you have namespaced filesystems but not networking?

Discussion turned to the idea that it is up to the vendor/user to configure containers in a way which makes sense for them, and similarly, they would also need to ensure that they configure security policy in a manner appropriate to that configuration.  I would say this means that semantic responsibility is pushed to the user with the kernel largely remaining a set of composable mechanisms, in relation to containers and security policy.  This provides a great deal of flexibility, but requires those building systems to take a great deal of care in their design.

There are still many issues to resolve, both upstream and at the distro/user level, and I expect this to be an active area of Linux security development for some time.  There were some excellent followup discussions in this area, including an approach which constrains the problem space. (Stay tuned)!

A highlight of the TPMs session was an update on the TPM 2.0 software stack, by Philip Tricca and Jarkko Sakkinen.  The slides may be downloaded here.  We should see a vastly improved experience over TPM 1.x with v2.0 hardware capabilities, and the new software stack.  I suppose the next challenge will be TPMs in the post-quantum era?

There were further technical discussions on TPMs and container security during subsequent days at LSS.  Bringing the two conference groups together here made for a very productive event overall.

TPMs microconf at LPC with Philip Tricca presenting on the 2.0 software stack.

This year, due to the overlap with LPC, we unfortunately did not have any LWN coverage.  There are, however, excellent writeups available from attendees:

There were many awesome talks.

The CII Best Practices Badge presentation by David Wheeler was an unexpected highlight for me.  CII refers to the Linux Foundation’s Core Infrastructure Initiative , a preemptive security effort for Open Source.  The Best Practices Badge Program is a secure development maturity model designed to allow open source projects to improve their security in an evolving and measurable manner.  There’s been very impressive engagement with the project from across open source, and I believe this is a critically important effort for security.

CII Bade Project adoption (from David Wheeler’s slides).

During Dan Cashman’s talk on SELinux policy modularization in Android O,  an interesting data point came up:

Interesting data from the talk: 44% of Android kernel vulns blocked by SELinux due to attack surface reduction.

— James Morris (@xjamesmorris) September 15, 2017

We of course expect to see application vulnerability mitigations arising from Mandatory Access Control (MAC) policies (SELinux, Smack, and AppArmor), but if you look closely this refers to kernel vulnerabilities.   So what is happening here?  It turns out that a side effect of MAC policies, particularly those implemented in tightly-defined environments such as Android, is a reduction in kernel attack surface.  It is generally more difficult to reach such kernel vulnerabilities when you have MAC security policies.  This is a side-effect of MAC, not a primary design goal, but nevertheless appears to be very effective in practice!

Another highlight for me was the update on the Kernel Self Protection Project lead by Kees, which is now approaching its 2nd anniversary, and continues the important work of hardening the mainline Linux kernel itself against attack.  I would like to also acknowledge the essential and original research performed in this area by grsecurity/PaX, from which this mainline work draws.

From a new development point of view, I’m thrilled to see the progress being made by Mickaël Salaün, on Landlock LSM, which provides unprivileged sandboxing via seccomp and LSM.  This is a novel approach which will allow applications to define and propagate their own sandbox policies.  Similar concepts are available in other OSs such as OSX (seatbelt) and BSD (pledge).  The great thing about Landlock is its consolidation of two existing Linux kernel security interfaces: LSM and Seccomp.  This ensures re-use of existing mechanisms, and aids usability by utilizing already familiar concepts for Linux users.

Mickaël Salaün from ANSSI talking about his Landlock LSM work at #linuxsecuritysummit 2017

— LinuxSecuritySummit (@LinuxSecSummit) September 14, 2017

Overall I found it to be an incredibly productive event, with many new and interesting ideas arising and lots of great collaboration in the hallway, lunch, and dinner tracks.

Slides from LSS may be found linked to the schedule abstracts.

We did not have a video sponsor for the event this year, and we’ll work on that again for next year’s summit.  We have discussed holding LSS again next year in conjunction with OSSNA, which is expected to be in Vancouver in August.

We are also investigating a European LSS in addition to the main summit for 2018 and beyond, as a way to help engage more widely with Linux security folk.  Stay tuned for official announcements on these!

Thanks once again to the awesome event staff at LF, especially Jillian Hall, who ensured everything ran smoothly.  Thanks also to the program committee who review, discuss, and vote on every proposal, ensuring that we have the best content for the event, and who work on technical planning for many months prior to the event.  And of course thanks to the presenters and attendees, without whom there would literally and figuratively be no event :)

See you in 2018!


October 02, 2017 01:52 AM

September 25, 2017

Pavel Machek: Colorful LEDs

RGB LEDs do not exist according to Linux LED subsystem. They are modeled as three separate LEDs, red, green and blue; that matches the hardware.

Unfortunately, it has problems. Lets begin with inconsistent naming: some drivers use :r suffix, some use :red. There's no explicit grouping of LEDs for one light -- there's no place to store parameters common for the light. (LEDs could be grouped by name.)

RGB colorspace is pretty well defined, and people expect to set specific colors. Unfortunately.... that does not work well with LEDs. First, LEDs are usually not balanced according to human perception system, so full power to the LEDs (255, 255, 255) may not
result in white. Second, monitors normally use gamma correction before displaying color, so (128, 128, 128) does not correspond to 50% of light being produced. But LEDs normally use PWM, so (128, 128, 128) does correspond to 50% light. Result is that colors are completely off.

I tested HSV colorspace for the LEDs. That would have advantage of old triggers being able to use selected colors... Unfortunately, on N900, white color is something like 15% blue, which would result in significantly reducing number of white intensities we can display.

September 25, 2017 08:30 AM

September 19, 2017

Pavel Machek: Unicsy phone

For a long time, I wanted a phone that runs Unix. And I got that, first Android, second Maemo on Nokia N900. With Android I realized that running Linux kernel is not enough. Android is really far away from normal Unix machine, and I'd argue away from anything usable, too. Maemo was slightly closer, and probably could be fixed if it was open-source.

But I realized Linux kernel is not really the most important part. There's more to Unix: compatibility with old apps, small programs where each one does one thing well, data in text formats so you can put them in git. Maemo got some parts right, at least you could run old apps in a useful way; but most important data on the phone (contacts, calendar) were still locked away in sqlite.

And that is something I'd like to change: phone that is ssh-friendly, text-editor-friendly and git-friendly. I call it "Unicsy phone". No, I don't want to do phone `cat addressbook | grep Friend | cut -f 1`... graphical utilities are okay. But console tools still should be there, and file formats should be reasonable.

So there is tui project, and recently postmarketos project appeared. Nokia N900 is mostly supported by mainline kernel (with exceptions of bluetooth and camera, everything works). There's work to be done, but it looks doable.

More is missing in the userspace. Phone parts need work, as expected. What is more surprising... there's emacs org mode, with great calendar capabilities, but I could not find matching application to display data nicely and provide alerts. Situation is even worse for contacts; emacs org can help there, too, but there does not seem to be agreement that this is the way to go. (And again, graphical applications would be nice).

September 19, 2017 10:17 PM

September 16, 2017

Pavel Machek: FlightGear fun

How to die in Boeing 707, quick and easy. Take off, realize that you should set up fuel heating, select Help|, aim for checklists.. and hit auto startup/shutdown. Instantly lose all the engines. Fortunately, you are at 6000', so you start looking for the airport. Then you
realize "hmm, perhaps I can do the startup thing now", and hit the menu item once again. But instead of running engines, you get fire warnings on all the engines. That does not look good. Confirm fire, extinguish all four engines, and resume looking for airport in range. Trim for best glide. Then number 3 comes up. Then number 4. Number one and you know it will be easy. Number two as you fly over the runway... go around and do normal approach.

September 16, 2017 11:41 AM

September 15, 2017

Linux Plumbers Conference: Linux Plumbers Conference Unconference Schedule Announced

Since we only have six proposals, we can schedule them in the unconference session without any need for actual voting at breakfast.  On a purely random basis, the schedule will be:

Unconference I:

09:30 Test driven development (TDD) in the kernel – Knut Omang
11:00 Support for adding DT based thermal zones at runtime – Moritz Fischer
11:50 Restartable Sequences interaction with debugger single-stepping – Mathieu Desnoyers

Unconference II:

14:00 Automated testing of LKML patches with Clang – Nick Desaulniers
14:50 ktask: multithread cpu-intensive kernel work – Daniel Jordan
16:00 Soft Affinity for Workloads – Rohit Jain

I’ll add these to the plumbers schedule (if the author doesn’t already have an account, I’ll show up as the speaker, but please take the above list as definitive for actual speaker).

Looking forward to seeing you all at this exciting new event for Plumbers,

September 15, 2017 04:57 AM

September 14, 2017

Grant Likely: Arcade Panel Construction Time-Lapse Video

September 14, 2017 05:56 PM

Grant Likely: NeoPixel Arcade Buttons

September 14, 2017 05:54 PM

September 13, 2017

Grant Likely: Custom Arcade Control Panels

I’ve started building custom arcade controls for using with classic arcade game emulators. All Open Source and Open Hardware of course, with the source code up on GitHub.

OpenSCAD:Arcade is an arcade panel modeling too written in OpenSCAD. It designs the arcade panel layout and produces lasercutter output and frame dimensions.

STM32F3-Discovery-Arcade is a prototype USB HID device for arcade controls. It currently supports GPIO joysticks and buttons, quadrature trackballs/spinners, and will drive up to 4 channels of NeoPixel RGB LED strings. The project has both custom STM32 firmware and a custom adaptor PCB designed with KiCad.

Please go take a look.

September 13, 2017 11:39 PM

Gustavo F. Padovan: Slides of my talk at Open Source Summit NA

I just delivered a talk today at Open Source Summit NA, here in LA, about everything we’ve been doing to support explicit synchronization on the Media and Graphics pipeline in the kernel. You can find the slides here.

The DRM side is already mainline, but V4L2 is currently my focus of work along with the linux-media community in the kernel. Blog posts about that should appear soon on this blog.

September 13, 2017 07:56 PM

September 11, 2017

Linux Plumbers Conference: New to Plumbers: Unconference on Friday

The hallway track is always a popular feature of Linux Plumbers Conference.  New ideas and solutions emerge all the time.  But sometimes you start a discussion, and want to pull others in before the conference ends, and just can’t quite make it work.

This year, we’re trying an experiment at Linux Plumbers and reserving a room for an unconference session on Friday,  so the ad hoc problem solving sessions for those topics with the most participant interest can be held.

If there is a topic you want to have a 1 hour discussion around,  please put it on the etherpad with:


Topic:  <something short>
Host(s): <person who will host the discussion>
Description:   <describe problem you want to talk about>


We’ll close down the topic page on Thursday night at 8pm,  and print the collected topics out on In the morning and post them in the room.     During the breakfast period (from 8 to 9am), those wanting to participate will be given four dots to vote.  Vote by placing a dot on the topics of interest until 8:45am.   Sessions will be scheduled as the one with the most dots first, and in descending order until we run out of sessions or time.

Schedule will be posted in the room on Friday morning.

September 11, 2017 05:54 PM

September 07, 2017

James Morris: Linux Plumbers Conference Sessions for Linux Security Summit Attendees

Folks attending the 2017 Linux Security Summit (LSS) next week may be also interested in attending the TPMs and Containers sessions at Linux Plumbers Conference (LPC) on the Wednesday.

The LPC TPMs microconf will be held in the morning and lead by Matthew Garret, while the containers microconf will be run by Stéphane Graber in the afternoon.  Several security topics will be discussed in the containers session, including namespacing and stacking of LSM, and namespacing of IMA.

Attendance on the Wednesday for LPC is at no extra cost for registered attendees of LSS.  Many thanks to the LPC organizers for arranging this!

There will be followup BOF sessions on LSM stacking and namespacing at LSS on Thursday, per the schedule.

This should be a very productive week for Linux security development: see you there!

September 07, 2017 01:44 AM

September 06, 2017

Linux Plumbers Conference: Linux Plumbers Conference Preliminary Schedule Published

You can see the schedule by clicking on the ‘schedule’ tab above or by going to this url

If you’d like any changes, please email and we’ll see what we can do to accommodate your request.

Please also remember that the schedule is subject to change.

September 06, 2017 10:01 PM

Greg Kroah-Hartman: 4.14 == This years LTS kernel

As the 4.13 release has now happened, the merge window for the 4.14 kernel release is now open. I mentioned this many weeks ago, but as the word doesn’t seem to have gotten very far based on various emails I’ve had recently, I figured I need to say it here as well.

So, here it is officially, 4.14 should be the next LTS kernel that I’ll be supporting with stable kernel patch backports for at least two years, unless it really is a horrid release and has major problems. If so, I reserve the right to pick a different kernel, but odds are, given just how well our development cycle has been going, that shouldn’t be a problem (although I guess I just doomed it now…)

As always, if people have questions about this, email me and I will be glad to discuss it, or talk to me in person next week at the LinuxCon^WOpenSourceSummit or Plumbers conference in Los Angeles, or at any of the other conferences I’ll be at this year (ELCE, Kernel Recipes, etc.)

September 06, 2017 02:41 PM

September 05, 2017

Kees Cook: security things in Linux v4.13

Previously: v4.12.

Here’s a short summary of some of interesting security things in Sunday’s v4.13 release of the Linux kernel:

security documentation ReSTification
The kernel has been switching to formatting documentation with ReST, and I noticed that none of the Documentation/security/ tree had been converted yet. I took the opportunity to take a few passes at formatting the existing documentation and, at Jon Corbet’s recommendation, split it up between end-user documentation (which is mainly how to use LSMs) and developer documentation (which is mainly how to use various internal APIs). A bunch of these docs need some updating, so maybe with the improved visibility, they’ll get some extra attention.

Since Peter Zijlstra implemented the refcount_t API in v4.11, Elena Reshetova (with Hans Liljestrand and David Windsor) has been systematically replacing atomic_t reference counters with refcount_t. As of v4.13, there are now close to 125 conversions with many more to come. However, there were concerns over the performance characteristics of the refcount_t implementation from the maintainers of the net, mm, and block subsystems. In order to assuage these concerns and help the conversion progress continue, I added an “unchecked” refcount_t implementation (identical to the earlier atomic_t implementation) as the default, with the fully checked implementation now available under CONFIG_REFCOUNT_FULL. The plan is that for v4.14 and beyond, the kernel can grow per-architecture implementations of refcount_t that have performance characteristics on par with atomic_t (as done in grsecurity’s PAX_REFCOUNT).

Daniel Micay created a version of glibc’s FORTIFY_SOURCE compile-time and run-time protection for finding overflows in the common string (e.g. strcpy, strcmp) and memory (e.g. memcpy, memcmp) functions. The idea is that since the compiler already knows the size of many of the buffer arguments used by these functions, it can already build in checks for buffer overflows. When all the sizes are known at compile time, this can actually allow the compiler to fail the build instead of continuing with a proven overflow. When only some of the sizes are known (e.g. destination size is known at compile-time, but source size is only known at run-time) run-time checks are added to catch any cases where an overflow might happen. Adding this found several places where minor leaks were happening, and Daniel and I chased down fixes for them.

One interesting note about this protection is that is only examines the size of the whole object for its size (via __builtin_object_size(..., 0)). If you have a string within a structure, CONFIG_FORTIFY_SOURCE as currently implemented will make sure only that you can’t copy beyond the structure (but therefore, you can still overflow the string within the structure). The next step in enhancing this protection is to switch from 0 (above) to 1, which will use the closest surrounding subobject (e.g. the string). However, there are a lot of cases where the kernel intentionally copies across multiple structure fields, which means more fixes before this higher level can be enabled.

NULL-prefixed stack canary
Rik van Riel and Daniel Micay changed how the stack canary is defined on 64-bit systems to always make sure that the leading byte is zero. This provides a deterministic defense against overflowing string functions (e.g. strcpy), since they will either stop an overflowing read at the NULL byte, or be unable to write a NULL byte, thereby always triggering the canary check. This does reduce the entropy from 64 bits to 56 bits for overflow cases where NULL bytes can be written (e.g. memcpy), but the trade-off is worth it. (Besdies, x86_64’s canary was 32-bits until recently.)

IPC refactoring
Partially in support of allowing IPC structure layouts to be randomized by the randstruct plugin, Manfred Spraul and I reorganized the internal layout of how IPC is tracked in the kernel. The resulting allocations are smaller and much easier to deal with, even if I initially missed a few needed container_of() uses.

randstruct gcc plugin
I ported grsecurity’s clever randstruct gcc plugin to upstream. This plugin allows structure layouts to be randomized on a per-build basis, providing a probabilistic defense against attacks that need to know the location of sensitive structure fields in kernel memory (which is most attacks). By moving things around in this fashion, attackers need to perform much more work to determine the resulting layout before they can mount a reliable attack.

Unfortunately, due to the timing of the development cycle, only the “manual” mode of randstruct landed in upstream (i.e. marking structures with __randomize_layout). v4.14 will also have the automatic mode enabled, which randomizes all structures that contain only function pointers.

A large number of fixes to support randstruct have been landing from v4.10 through v4.13, most of which were already identified and fixed by grsecurity, but many were novel, either in newly added drivers, as whitelisted cross-structure casts, refactorings (like IPC noted above), or in a corner case on ARM found during upstream testing.

One of the issues identified from the Stack Clash set of vulnerabilities was that it was possible to collide stack memory with the highest portion of a PIE program’s text memory since the default ELF_ET_DYN_BASE (the lowest possible random position of a PIE executable in memory) was already so high in the memory layout (specifically, 2/3rds of the way through the address space). Fixing this required teaching the ELF loader how to load interpreters as shared objects in the mmap region instead of as a PIE executable (to avoid potentially colliding with the binary it was loading). As a result, the PIE default could be moved down to ET_EXEC (0x400000) on 32-bit, entirely avoiding the subset of Stack Clash attacks. 64-bit could be moved to just above the 32-bit address space (0x100000000), leaving the entire 32-bit region open for VMs to do 32-bit addressing, but late in the cycle it was discovered that Address Sanitizer couldn’t handle it moving. With most of the Stack Clash risk only applicable to 32-bit, fixing 64-bit has been deferred until there is a way to teach Address Sanitizer how to load itself as a shared object instead of as a PIE binary.

early device randomness
I noticed that early device randomness wasn’t actually getting added to the kernel entropy pools, so I fixed that to improve the effectiveness of the latent_entropy gcc plugin.

That’s it for now; please let me know if I missed anything. As a side note, I was rather alarmed to discover that due to all my trivial ReSTification formatting, and tiny FORTIFY_SOURCE and randstruct fixes, I made it into the most active 4.13 developers list (by patch count) at LWN with 76 patches: a whopping 0.6% of the cycle’s patches. ;)

Anyway, the v4.14 merge window is open!

© 2017, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
Creative Commons License

September 05, 2017 11:01 PM

Linux Plumbers Conference: Testing and Fuzzing Microconference Accepted into the Linux Plumbers Conference

We’re pleased to announce that newcomer Microconference Testing and Fuzzing will feature at Plumbers in Los Angeles this year.

The Agenda will feature the three fuzzers used for the Linux Kernel (Trinity, Syzkaller and Perf) along with discussion of formal verification tools, discussion of how to test stable trees, testing frameworks and also a discussion and demonstration of the drm/i915 checkin and test infrastructure.

Additionally, we will hold a session aimed at improving the testing process for linux-stable and distro kernels. Please plan to attend if you have input into how to integrate additional testing and make these kernels more reliable. Participants will include Greg Kroah-Hartman and major distro kernel maintainers.

For more details on this, please see this microconference’s wiki page.

We hope to see you there!

September 05, 2017 08:04 AM