Kernel Planet

March 20, 2018

Davidlohr Bueso: Linux v4.15: Performance Goodies

With the Meltdown and Spectre fiascos, performance isn't a very hot topic at the moment. In fact, with Linux v4.15 released, it is one of the rare times I've seen security win over performance in such a one-sided way. Normally security features are tucked away under a kernel config option nobody really uses. Of course the software fixes are also backported in one way or another, so this isn't really specific to the latest kernel release.

All this said, v4.15 came out with a few performance enhancements across subsystems. The following is an unsorted and incomplete list of changes that went in. Note that the term 'performance' can be vague in that some gains in one area can negatively affect another, so take everything with a grain of salt and reach your own conclusions.

epoll: scale nested calls

Nested epolls are necessary to allow semantics where a file descriptor in the epoll interest list is also an epoll instance. Such calls are not all that common, but some real-world applications suffered severe performance issues because the implementation relied on global spinlocks, acquired throughout the callbacks in the epoll state machine. By removing them, we can speed up adding fds to the instance as well as polling, such that epoll_wait() can improve by 100x, scaling linearly as increasing numbers of cores block on an event.
[Commit 57a173bdf5ba, 37b5e5212a44]
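
To illustrate what "nested epoll" means from userspace, here is a minimal sketch using the plain epoll API (this only shows the call pattern being discussed; it is not the benchmark behind the numbers above):

#include <stdio.h>
#include <sys/epoll.h>

int main(void)
{
        int outer = epoll_create1(0);
        int inner = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = inner };
        struct epoll_event ready;

        if (outer < 0 || inner < 0) {
                perror("epoll_create1");
                return 1;
        }

        /* The inner epoll instance itself sits on the outer instance's
           interest list: this is the nested case the commits speed up. */
        if (epoll_ctl(outer, EPOLL_CTL_ADD, inner, &ev) < 0) {
                perror("epoll_ctl");
                return 1;
        }

        /* Waiting on the outer fd wakes up when the inner instance has
           ready events, i.e. when fds registered with 'inner' fire. */
        int n = epoll_wait(outer, &ready, 1, 0);
        printf("ready events: %d\n", n);
        return 0;
}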

pvspinlock: hybrid fairness paravirt semantics

Locking under virtual environments can be tricky, balancing performance and fairness while avoiding artifacts such as starvation and lock holder/waiter preemption. The current paravirtual queued spinlocks, while free from starvation, can perform less optimally than an unfair lock in guests with CPU over-commitment. With Linux v4.15, guest spinlocks now combine the best of both worlds, with an unfair and a queued mode. The idea is that, upon contention, the lock-stealing attempt in the slowpath (unfair mode) is extended for as long as there are queued MCS waiters present, hence improving performance while avoiding starvation. Kernel build experiments show that as a VM becomes more and more over-committed, the ratio of locks acquired in unfair mode increases.
[Commit 11752adb68a3]
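
The behavior described above can be pictured with a very rough sketch; every name here except cpu_relax() is made up, and the real qspinlock code is considerably more subtle (it also bounds stealing so the queue's anti-starvation guarantees hold):

/* Illustrative pseudocode only, not the kernel implementation. */
static void hybrid_pv_lock(struct fake_lock *lock)
{
        if (fake_trylock(lock))                 /* uncontended fast path */
                return;

        /* Unfair mode: while MCS waiters are already queued, keep
           attempting to steal the lock directly. */
        while (fake_mcs_queue_nonempty(lock)) {
                if (fake_trylock(lock))
                        return;
                cpu_relax();
        }

        /* Fair mode: join the MCS queue and wait for our turn. */
        fake_queued_slowpath(lock);
}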

mm,x86: avoid saving/restoring interrupts state in gup

When x86 was converted to use the generic get_user_pages_fast() call, a performance regression was introduced at the microbenchmark level. The generic gup function attempts to walk the page tables without acquiring any locks, such as the mmap semaphore. In order to do this, interrupts must be disabled, which is where things diverged between the arch-specific and generic flavors. The latter must save and restore the current interrupt state, introducing extra overhead compared to a simple local_irq_enable/disable().
[Commit 5b65c4677a57]
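
The two patterns being compared look roughly like this (an illustrative sketch with made-up wrapper functions; local_irq_save/restore and local_irq_disable/enable are the real kernel primitives):

/* Generic flavor: must preserve whatever IRQ state the caller had. */
static void generic_style_walk(void)
{
        unsigned long flags;

        local_irq_save(flags);          /* reads and saves the current IRQ state */
        /* ... lockless page-table walk ... */
        local_irq_restore(flags);
}

/* Arch-specific flavor: the caller is known to run with IRQs enabled,
   so a plain disable/enable pair avoids touching the flags. */
static void arch_style_walk(void)
{
        local_irq_disable();
        /* ... lockless page-table walk ... */
        local_irq_enable();
}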

ipc: scale INFO commands

Any syscall used to get info from sysvipc (such as semctl(IPC_INFO) or shmctl(SHM_INFO)) requires internally computing the last ipc identifier. For cases with large numbers of keys, this operation alone can consume a significant number of cycles, as the value was looked up on demand, in O(N). In order to make this information available in constant time, we now keep track of it whenever a new identifier is added.
[Commit 15df03c87983]
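
A hedged sketch of the idea, with made-up names rather than the actual ipc/util.c structures:

struct ipc_ids_sketch {
        int max_idx;    /* highest identifier index in use, -1 if none */
        /* ... the real structure also holds the idr of ipc objects ... */
};

/* Before: the highest identifier was computed on demand by scanning, O(N).
   After: it is maintained incrementally on every addition, so the
   IPC_INFO/SHM_INFO/SEM_INFO paths can read it in O(1). */
static void ipc_id_added(struct ipc_ids_sketch *ids, int idx)
{
        if (idx > ids->max_idx)
                ids->max_idx = idx;
}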

ext4: improve smp scalability for inode generation

The superblock's inode generation number was sequentially increased (from a randomly initialized value) and protected by a spinlock, making the usage pattern quite primitive and not very friendly to workloads that generate files/inodes concurrently. The inode generation path was optimized to remove the lock altogether and simply rely on prandom_u32(), such that a fast, seeded pseudo-random-number algorithm is used for computing the i_generation.
[Commit 232530680290]
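
The end result is essentially a one-liner; a hedged sketch (surrounding ext4 code omitted, with prandom_u32() being the kernel's fast, seeded pseudo-random number generator):

/* Before: a spinlock-protected, sequentially increasing counter in the
   superblock info.  After: no shared state and no lock at all. */
inode->i_generation = prandom_u32();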

March 20, 2018 05:37 PM

March 15, 2018

Pete Zaitcev: The more you tighten your grip

Seen at the webpage for RancherOS:

Everything in RancherOS is a Docker container. We accomplish this by launching two instances of Docker. One is what we call System Docker, the first process on the system. All other system services, like ntpd, syslog, and console, are running in Docker containers. System Docker replaces traditional init systems like systemd, and can be used to launch additional system services.

March 15, 2018 10:33 PM

March 13, 2018

Pete Zaitcev: You Are Not Uber: Only Uber Are Uber

Remember how the FAA shut down the business of NavWorx, with heavy monetary and loss-of-use consequences for its customers? Imagine receiving a letter from the U.S. Government telling you that your car is not compatible with roads, and therefore you are prohibited from continuing to drive it. Someone sure forgot that the power to regulate is the power to destroy. This week, we have this report by IEEE Spectrum:

IEEE Spectrum can reveal that the SpaceBees are almost certainly the first spacecraft from a Silicon Valley startup called Swarm Technologies, currently still in stealth mode. Swarm was founded in 2016 by one engineer who developed a spacecraft concept for Google and another who sold his previous company to Apple. The SpaceBees were built as technology demonstrators for a new space-based Internet of Things communications network.

The only problem is, the Federal Communications Commission (FCC) had dismissed Swarm’s application for its experimental satellites a month earlier, on safety grounds.

On Wednesday, the FCC sent Swarm a letter revoking its authorization for a follow-up mission with four more satellites, due to launch next month. A pending application for a large market trial of Swarm’s system with two Fortune 100 companies could also be in jeopardy.

Swarm Technologies, based in Menlo Park, Calif., is the brainchild of two talented young aerospace engineers. Sara Spangelo, its CEO, is a Canadian who worked at NASA’s Jet Propulsion Laboratory, before moving to Google in 2016. Spangelo’s astronaut candidate profile at the Canadian Space Agency says that while at Google, she led a team developing a spacecraft concept for its moonshot X division, including both technical and market analyses.

Swarm CFO Benjamin Longmier has an equally impressive resume. In 2015, he sold his near-space balloon company Aether Industries to Apple, before taking a teaching post at the University of Michigan. He is also co-founder of Apollo Fusion, a company producing an innovative electric propulsion system for satellites.

Although a leading supplier in its market, NavWorx was a bit player at the government level. Not that many people have small private airplanes anymore. But Swarm operates at a different level, and may be able to grease enough palms in Washington, D.C., to survive this debacle. Or, they may reconstitute as a notionally new company, then claim a clean start. Again, unlike NavWorx, there's no installed base.

March 13, 2018 03:45 PM

March 11, 2018

Greg Kroah-Hartman: My affidavit in the Geniatech vs. McHardy case

As many people know, last week there was a court hearing in the Geniatech vs. McHardy case. This was a case brought claiming a license violation of the Linux kernel in Geniatech devices in the German court of OLG Cologne.

Harald Welte has written up a wonderful summary of the hearing, I strongly recommend that everyone go read that first.

In Harald’s summary, he refers to an affidavit that I provided to the court. Because the case was withdrawn by McHardy, my affidavit was not entered into the public record. I had always assumed that my affidavit would be made public, and since I have had a number of people ask me about what it contained, I figured it was good to just publish it for everyone to be able to see it.

There are some minor edits from what was exactly submitted to the court, such as the side-by-side German translation of the English text, and some reformatting around some footnotes in the text, because I don’t know how to do that directly here, and they really were not all that relevant for anyone who reads this blog. Exhibit A is also not reproduced, as it’s just a huge list of all of the kernel releases in which I felt there was no evidence of any contribution by Patrick McHardy.


I, the undersigned, Greg Kroah-Hartman,
declare in lieu of an oath and in the
knowledge that a wrong declaration in
lieu of an oath is punishable, to be
submitted before the Court:

I. With regard to me personally:

1. I have been an active contributor to
   the Linux Kernel since 1999.

2. Since February 1, 2012 I have been a
   Linux Foundation Fellow.  I am currently
   one of five Linux Foundation Fellows
   devoted to full time maintenance and
   advancement of Linux. In particular, I am
   the current Linux stable Kernel maintainer
   and manage the stable Kernel releases. I
   am also the maintainer for a variety of
   different subsystems that include USB,
   staging, driver core, tty, and sysfs,
   among others.

3. I have been a member of the Linux
   Technical Advisory Board since 2005.

4. I have authored two books on Linux Kernel
   development including Linux Kernel in a
   Nutshell (2006) and Linux Device Drivers
   (co-authored Third Edition in 2009.)

5. I have been a contributing editor to Linux
   Journal from 2003 - 2006.

6. I am a co-author of every Linux Kernel
   Development Report. The first report was
   based on my Ottawa Linux Symposium keynote
   in 2006, and the report has been published
   every few years since then. I have been
   one of the co-authors on all of them. This
   report includes a periodic in-depth
   analysis of who is currently contributing
   to Linux. Because of this work, I have an
   in-depth knowledge of the various records
   of contributions that have been maintained
   over the course of the Linux Kernel
   project.

   For many years, Linus Torvalds compiled a
   list of contributors to the Linux kernel
   with each release. There are also usenet
   and email records of contributions made
   prior to 2005. In April of 2005, Linus
   Torvalds created a program now known as
   “Git” which is a version control system
   for tracking changes in computer files and
   coordinating work on those files among
   multiple people. Every Git directory on
   every computer contains an accurate
   repository with complete history and full
   version tracking abilities.  Every Git
   directory captures the identity of
   contributors.  Development of the Linux
   kernel has been tracked and managed using
   Git since April of 2005.

   One of the findings in the report is that
   since the 2.6.11 release in 2005, a total
   of 15,637 developers have contributed to
   the Linux Kernel.

7. I have been an advisor on the Cregit
   project and compared its results to other
   methods that have been used to identify
   contributors and contributions to the
   Linux Kernel, such as a tool known as “git
   blame” that is used by developers to
   identify contributions to a git repository
   such as the repositories used by the Linux
   Kernel project.

8. I have been shown documents related to
   court actions by Patrick McHardy to
   enforce copyright claims regarding the
   Linux Kernel. I have heard many people
   familiar with the court actions discuss
   the cases and the threats of injunction
   McHardy leverages to obtain financial
   settlements. I have not otherwise been
   involved in any of the previous court
   actions.

II. With regard to the facts:

1. The Linux Kernel project started in 1991
   with a release of code authored entirely
   by Linus Torvalds (who is also currently a
   Linux Foundation Fellow).  Since that time
   there have been a variety of ways in which
   contributions and contributors to the
   Linux Kernel have been tracked and
   identified. I am familiar with these
   methods.

2. The first record of any contribution
   explicitly attributed to Patrick McHardy
   to the Linux kernel is April 23, 2002.
   McHardy’s last contribution to the Linux
   Kernel was made on November 24, 2015.

3. The Linux Kernel 2.5.12 was released by
   Linus Torvalds on April 30, 2002.

4. After review of the relevant records, I
   conclude that there is no evidence in the
   records that the Kernel community relies
   upon to identify contributions and
   contributors that Patrick McHardy made any
   code contributions to versions of the
   Linux Kernel earlier than 2.4.18 and
   2.5.12. Attached as Exhibit A is a list of
   Kernel releases which have no evidence in
   the relevant records of any contribution
   by Patrick McHardy.

March 11, 2018 01:51 AM

March 07, 2018

Dave Airlie (blogspot): radv - Vulkan 1.1 conformant on launch day

Vulkan 1.1 was officially released today, and thanks to a big effort by Bas and a lot of shared work from the Intel anv developers, radv is a launch day conformant implementation.

Here is a link to the conformance results. This is also radv's first time to be officially conformant on Vega GPUs.
Here is the patch series; it requires a bunch of common anv patches to land first. This stuff should all be landing in Mesa shortly, or most likely already will have by the time you read this.

In order to advertise 1.1 you need at least a 4.15 Linux kernel.

Thanks to all involved in making this happen, including the behind-the-scenes effort to allow radv to participate in the launch day!

March 07, 2018 07:13 PM

March 04, 2018

Pete Zaitcev: MITM in Ireland

I'm just back from OpenStack PTG (Project Technical Gathering) in Dublin, Ireland and while I was there, Firefox reported wrong TLS certificates for some obscure websites, although not others. Example: retains old certificate, as does But goes bad. I presume that Irish authorities and/or ISPs deemed it proper to MITM these sites. The question is, why such a strange choice of targets?

The is a free speech and discussion site, named, as much as I can tell, after an old (possibly classic or memetic) Wondermark cartoon. Maybe the Irish just hate the free speech.

Or, they do not MITM sites that have TLS settings that are too easy to break... and Gmail.

March 04, 2018 07:14 AM

February 21, 2018

Paul E. Mc Kenney: Exit Libris

I have only so many bookshelves, and I have not yet bought into ereaders, so from time to time books must leave. Here is the current batch:

It is a bit sad to abandon some old friends, but such is life with physical books!

February 21, 2018 05:06 AM

February 18, 2018

Linux Plumbers Conference: Summary of Survey Results – Thanks to all those who responded

Thank you to everyone who participated in the survey after Linux Plumbers in 2017. We had 134 responses which, given the total number of conference participants of around 354, provides confidence in the feedback trends.

Overall – 85% of respondents were positive about the event,  with only 2% actually saying they were dissatisfied.    Co-locating with Open Source Summit did not provide as much benefit as locating with the Kernel Summit in the past, so we will be co-locating with Kernel Summit in 2018.    This preference was also echoed in the write-in comments.   Conference participation was down from 2016,  but adding back the Kernel Summit colocation should address this.

On a positive note, the wireless woes of 2016 were resolved, and survey feedback indicated satisfaction in this area.   Also, folks have let us know that they were able to hear better in the rooms this time and follow the conversations – the throwable microphones were helpful here.  53% felt the conference size was about right, with 45% wanting more to be able to attend.

Communication – People generally approved of the communication from the committee (we didn’t spam you too much), and you were able to find the talks you wanted to attend. The authors and miniconf leads that responded followed the trend.

Venue – From the feedback, we got the clear signal that smaller venues like Santa Fe are preferred. For 2018, Plumbers will be held in Vancouver, Canada, where we’ll have a floor dedicated to us. From your feedback, we got wireless, power plug access, and hacking space areas right this year, but had problems with on-site catering taking the break beverages and snacks away too soon. The use of meal cards continues to be very popular, and the catering at the off-site events was well received and appreciated.

Events – The Closing Plenary was generally well received.  Some individuals didn’t find the lightning summaries at the closing that useful, but overall the survey feedback for those responding was either neutral or positive (less than 5% negative), similar to 2016.   We’re looking into the feasibility of some of the suggestions from the written comments to try to improve the closing summary further.   There were several compliments that came through on our evening events, and again the overall feedback provided was very positive.

Location –  Respondents were very positive about the convenience of having the hotel as the conference site,  and were able to use the negotiated rates.   They were more neutral about the choice of LA for the event (some liking it, some not).

Sessions – Of the sessions,  the hallway track continues to remain the most popular and well attended.   There was a very positive response to most of the miniconfs and talks; the refereed track running in parallel was popular.  Our experiment of using part of the time for an unconference was generally well received by those participating, but the write-in comments have some good suggestions for improving this.   Similarly making the schedule visible before the early registration closes is something that attendees want to see.   Keeping the focus on solving problems rather than presenting status is something we have improved on, and will continue to emphasize for next year.

There were lots of great suggestions in the “what one thing would you like to see changed”, and the program committee has been studying them to see what is possible to implement this year.    Thank you again to the participants for their input and help on making the Linux Plumbers Conference better in 2018 and the future.

February 18, 2018 08:07 PM

February 16, 2018

Pete Zaitcev: ARM servers apparently exist at last

Check out what I found at Pogo Linux (h/t Bryan Lunduke):

ARM R150-T62
2 x Cavium® ThunderX™ 48 Core ARM processors
16 x DDR4 DIMM slots
3 x 40GbE QSFP+ LAN ports
4 x 10GbE SFP+ LAN ports
4 x 3.5” hot-swappable HDD/SSD bays
650W 80 PLUS Platinum redundant PSU

The prices are ridiculous, but at least it's a server with CentOS.

February 16, 2018 06:42 AM

Dave Airlie (blogspot): virgl caps - oops I messed up

When I designed virgl I added a capability system to pass some info about the host GL to the guest driver, along the lines of gallium caps. The design was that at the virtio GPU level you have a number of capsets, each of which has a max version and max size.

The virgl capset is capset 1 with max version 1 and size 308 bytes.

Until now we've happily been using version 1 at 308 bytes. Recently we decided we wanted to have a v2 at 380 bytes, and the world fell apart.

It turned out there is a bug in the guest kernel driver: it asks the host for a list of capsets and allows guest userspace to retrieve from it. The guest userspace has its own copy of the struct.

The flow is:
Guest mesa driver gives kernel a caps struct to fill out for capset 1.
Kernel driver asks the host over virtio for latest capset 1 info, max size, version.
Host gives it the max_size, version for capset 1.
Kernel driver asks host to fill out malloced memory of the max_size with the caps struct.
Kernel driver copies the returned caps struct to userspace, using the size of the returned host struct.

The bug is in the last step: it uses the size of the returned host struct, which ends up corrupting the guest in the scenario where the host has a capset 1 v2, size 380, but the guest is still running old userspace which only understands capset 1 v1, size 308.

The 380 bytes gets memcpy over the 308 byte struct and boom.

Now we can fix the kernel to not do this, but we can't upgrade every kernel in an existing VM. So if we allow the virglrenderer process to expose a v2, all older guest software will explode unless it is also upgraded, which isn't really something you want in a VM world.

I came up with some virglrenderer workarounds, but due to another bug where qemu doesn't reset virglrenderer when it should, there was no way to make it reliable, and things like kexec old kernel from new kernel would blow up.

I decided in the end to bite the bullet and just make capset 2 be a repaired one. Unfortunately this needs patches in all 4 components before it can be used.

1) virglrenderer needs to expose capset 2 with the new version/size to qemu.
2) qemu needs to allow the virtio-gpu to transfer capset 2 as a virgl capset to the host.
3) The guest kernel needs fixing to make sure we copy the minimum of the host caps and the guest caps into the guest userspace driver, then it needs to provide a way that guest userspace knows the fixed version is in place.
4) The guest userspace needs to check if the guest kernel has the fix, and then query capset 2 first, and fallback to querying capset 1.

After talking to a few other devs in virgl land, they pointed out we could probably just never add a new version of capset 2, and grow the struct endlessly.

The guest driver would fill out the struct it wants to use with its copy of default minimum values.
It would then call the kernel ioctl to copy over the host caps. The kernel ioctl would copy the minimum size of the host caps and the guest caps.

In this case if the host has a 400 byte capset 2, and the guest still only has 380 byte capset 2, the new fields from the host won't get copied into the guest struct and it will be fine.

If the guest has the 400 byte capset 2, but the host only has the 380 byte capset 2, the guest would preinit the extra 20 bytes with its default values (0 or whatever) and the kernel would only copy 380 bytes into the start of the 400 bytes and leave the extra bytes alone.
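
A hedged sketch of that "copy the minimum of the two sizes" rule (illustrative C with made-up names, not the actual virtio-gpu kernel code):

#include <stddef.h>
#include <string.h>

/* guest_caps/guest_size: the struct the guest driver knows about,
   pre-filled with its default values.
   host_caps/host_size: the caps blob returned by the host. */
static void copy_caps_min(void *guest_caps, size_t guest_size,
                          const void *host_caps, size_t host_size)
{
        size_t n = guest_size < host_size ? guest_size : host_size;

        /* Copy only what both sides understand: a newer host cannot
           overflow an older guest struct, and an older host simply
           leaves the guest's extra (defaulted) fields untouched. */
        memcpy(guest_caps, host_caps, n);
}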

Now I just have to go write the patches and confirm it all.

Thanks to Stephane at google for creating the patch that showed how broken it was, and to others in the virgl community who noticed how badly it broke old guests! Now to go write the patches...

February 16, 2018 12:11 AM

February 14, 2018

Pete Zaitcev: More system administration in the age of SystemD

I'm tinkering with OpenStack TripleO in a simulated environment. It uses a dedicated non-privileged user, "stack", which can do things such as list VMs with "virsh list". So, yesterday I stopped the undercloud VM, and went to sleep. Today, I want to restart it... but virsh says:

error: failed to connect to the hypervisor
error: Cannot create user runtime directory '/run/user/1000/libvirt': Permission denied

What seems to happen is that when one logs into the stack@ user over ssh, systemd-logind mounts that /run/user/UID thing, but if I log in as zaitcev@ and then do "su - stack", this fails to occur.

I have no idea what to do about this. It's probably trivial for someone more knowledgeable to throw the right pam_systemd line into /etc/pam.d/su. But su-l includes system-auth, which invokes, and yet... Oh well.

February 14, 2018 11:23 PM

February 06, 2018

Eric Sandeen: LEAF battery replacement update

New LEAF battery

Just a quick note here – the LEAF battery did finally go under warranty on Sept 24, 2017, and I got it replaced with minimal hassle, back in great shape on October 3. The LeafSPY stats on the new battery actually dropped fairly quickly after I got it, which was worrisome, but now (in the very cold weather) it’s holding steady at about 97% state of health, with 62.3Ahr and 90.35Hx.

The stats when it finally dropped the 9th bar were:

Miles: 40623
Ahr: 43.51
Hx: 45.25

I’ve definitely needed that fresh capacity for this harsh winter. It’s been fine, but frigid mornings still show the Guess-o-Meter at as low as 50-60 miles at times.

February 06, 2018 08:25 PM

February 05, 2018

Greg Kroah-Hartman: Linux Kernel Release Model


This post is based on a whitepaper I wrote at the beginning of 2016 to be used to help many different companies understand the Linux kernel release model and encourage them to start taking the LTS stable updates more often. I then used it as the basis of a presentation I gave at the Kernel Recipes conference in September 2017, which can be seen here.

With the recent craziness of Meltdown and Spectre, I’ve seen lots of things written about how Linux is released and how we handle security patches that are totally incorrect, so I figured it is time to dust off the text, update it in a few places, and publish this here for everyone to benefit from.

I would like to thank the reviewers who helped shape the original whitepaper, which has helped many companies understand that they need to stop “cherry picking” random patches into their device kernels. Without their help, this post would be a total mess. All problems and mistakes in here are, of course, all mine. If you notice any, or have any questions about this, please let me know.


This post describes how the Linux kernel development model works, what a long term supported kernel is, how the kernel developers approach security bugs, and why all systems that use Linux should be using all of the stable releases and not attempting to pick and choose random patches.

Linux Kernel development model

The Linux kernel is the largest collaborative software project ever. In 2017, over 4,300 different developers from over 530 different companies contributed to the project. There were 5 different releases in 2017, with each release containing between 12,000 and 14,500 different changes. On average, 8.5 changes are accepted into the Linux kernel every hour, every hour of the day. A non-scientific study (i.e. Greg’s mailbox) shows that each change needs to be submitted 2-3 times before it is accepted into the kernel source tree due to the rigorous review and testing process that all kernel changes are put through, so the engineering effort happening is much larger than the 8 changes per hour.

At the end of 2017 the size of the Linux kernel was just over 61 thousand files consisting of 25 million lines of code, build scripts, and documentation (kernel release 4.14). The Linux kernel contains the code for all of the different chip architectures and hardware drivers that it supports. Because of this, an individual system only runs a fraction of the whole codebase. An average laptop uses around 2 million lines of kernel code from 5 thousand files to function properly, while the Pixel phone uses 3.2 million lines of kernel code from 6 thousand files due to the increased complexity of a SoC.

Kernel release model

With the release of the 2.6 kernel in December of 2003, the kernel developer community switched from the previous model of having a separate development and stable kernel branch, and moved to a “stable only” branch model. A new release happened every 2 to 3 months, and that release was declared “stable” and recommended for all users to run. This change in development model was due to the very long release cycle prior to the 2.6 kernel (almost 3 years), and the struggle to maintain two different branches of the codebase at the same time.

The numbering of the kernel releases started out being 2.6.x, where x was an incrementing number that changed on every release. The value of the number has no meaning, other than it is newer than the previous kernel release. In July 2011, Linus Torvalds changed the version number to 3.x after the 2.6.39 kernel was released. This was done because the higher numbers were starting to cause confusion among users, and because Greg Kroah-Hartman, the stable kernel maintainer, was getting tired of the large numbers and bribed Linus with a fine bottle of Japanese whisky.

The change to the 3.x numbering series did not mean anything other than a change of the major release number, and this happened again in April 2015 with the movement from the 3.19 release to the 4.0 release number. It is not remembered if any whisky exchanged hands when this happened. At the current kernel release rate, the number will change to 5.x sometime in 2018.

Stable kernel releases

The Linux kernel stable release model started in 2005, when the existing development model of the kernel (a new release every 2-3 months) was determined to not be meeting the needs of most users. Users wanted bugfixes that were made during those 2-3 months, and the Linux distributions were getting tired of trying to keep their kernels up to date without any feedback from the kernel community. Trying to keep individual kernels secure and with the latest bugfixes was a large and confusing effort by lots of different individuals.

Because of this, the stable kernel releases were started. These releases are based directly on Linus’s releases, and are released every week or so, depending on various external factors (time of year, available patches, maintainer workload, etc.)

The numbering of the stable releases starts with the number of the kernel release, and an additional number is added to the end of it.

For example, the 4.9 kernel is released by Linus, and then the stable kernel releases based on this kernel are numbered 4.9.1, 4.9.2, 4.9.3, and so on. This sequence is usually shortened with the number “4.9.y” when referring to a stable kernel release tree. Each stable kernel release tree is maintained by a single kernel developer, who is responsible for picking the needed patches for the release, and doing the review/release process. Where these changes are found is described below.

Stable kernels are maintained for as long as the current development cycle is happening. After Linus releases a new kernel, the previous stable kernel release tree is stopped and users must move to the newer released kernel.

Long-Term Stable kernels

After a year of this new stable release process, it was determined that many different users of Linux wanted a kernel to be supported for longer than just a few months. Because of this, the Long Term Supported (LTS) kernel release came about. The first LTS kernel was 2.6.16, released in 2006. Since then, a new LTS kernel has been picked once a year. That kernel will be maintained by the kernel community for at least 2 years. See the next section for how a kernel is chosen to be a LTS release.

Currently the LTS kernels are the 4.4.y, 4.9.y, and 4.14.y releases, and a new kernel is released on average, once a week. Along with these three kernel releases, a few older kernels are still being maintained by some kernel developers at a slower release cycle due to the needs of some users and distributions.

Information about all long-term stable kernels, who is in charge of them, and how long they will be maintained, can be found on the release page.

LTS kernel releases average 9-10 patches accepted per day, while the normal stable kernel releases contain 10-15 patches per day. The number of patches fluctuates per release given the current time of the corresponding development kernel release, and other external variables. The older an LTS kernel is, the fewer patches are applicable to it, because many recent bugfixes are not relevant to older kernels. However, the older a kernel is, the harder it is to backport the changes that need to be applied, due to the changes in the codebase. So while there might be a lower number of overall patches being applied, the effort involved in maintaining an LTS kernel is greater than maintaining the normal stable kernel.

Choosing the LTS kernel

The method of picking which kernel the LTS release will be, and who will maintain it, has changed over the years from a semi-random method to something that is hopefully more reliable.

Originally it was merely based on what kernel the stable maintainer’s employer was using for their product (2.6.16.y and 2.6.27.y) in order to make the effort of maintaining that kernel easier. Other distribution maintainers saw the benefit of this model and got together and colluded to get their companies to all release a product based on the same kernel version without realizing it (2.6.32.y). After that was very successful, and allowed developers to share work across companies, those companies decided to not do that anymore, so future LTS kernels were picked on an individual distribution’s needs and maintained by different developers (3.0.y, 3.2.y, 3.12.y, 3.16.y, and 3.18.y) creating more work and confusion for everyone involved.

This ad-hoc method of catering to only specific Linux distributions was not beneficial to the millions of devices that used Linux in an embedded system and were not based on a traditional Linux distribution. Because of this, Greg Kroah-Hartman decided that the choice of the LTS kernel needed to change to a method in which companies can plan on using the LTS kernel in their products. The rule became “one kernel will be picked each year, and will be maintained for two years.” With that rule, the 3.4.y, 3.10.y, and 3.14.y kernels were picked.

Due to a large number of different LTS kernels being released all in the same year, causing lots of confusion for vendors and users, the rule of no new LTS kernels being based on an individual distribution’s needs was created. This was agreed upon at the annual Linux kernel summit and started with the 4.1.y LTS choice.

During this process, the LTS kernel would only be announced after the release happened, making it hard for companies to plan ahead of time what to use in their new product, causing lots of guessing and misinformation to be spread around. This was done on purpose as previously, when companies and kernel developers knew ahead of time what the next LTS kernel was going to be, they relaxed their normal stringent review process and allowed lots of untested code to be merged (2.6.32.y). The fallout of that mess took many months to unwind and stabilize the kernel to a proper level.

The kernel community discussed this issue at its annual meeting and decided to mark the 4.4.y kernel as a LTS kernel release, much to the surprise of everyone involved, with the goal that the next LTS kernel would be planned ahead of time to be based on the last kernel release of 2016 in order to provide enough time for companies to release products based on it in the next holiday season (2017). This is how the 4.9.y and 4.14.y kernels were picked as the LTS kernel releases.

This process seems to have worked out well, without many problems being reported against the 4.9.y tree, despite it containing over 16,000 changes, making it the largest kernel to ever be released.

Future LTS kernels should be planned based on this release cycle (the last kernel of the year). This should allow SoC vendors to plan ahead on their development cycle to not release new chipsets based on older, and soon to be obsolete, LTS kernel versions.

Stable kernel patch rules

The rules for what can be added to a stable kernel release have remained almost identical for the past 12 years. The full list of the rules for patches to be accepted into a stable kernel release can be found in the Documentation/process/stable_kernel_rules.rst kernel file and are summarized here. A stable kernel change:

The last rule, “a change must be in Linus’s tree”, prevents the kernel community from losing fixes. The community never wants a fix to go into a stable kernel release that is not already in Linus’s tree so that anyone who upgrades should never see a regression. This prevents many problems that other projects who maintain a stable and development branch can have.

Kernel Updates

The Linux kernel community has promised its userbase that no upgrade will ever break anything that is currently working in a previous release. That promise was made in 2007 at the annual Kernel developer summit in Cambridge, England, and still holds true today. Regressions do happen, but those are the highest priority bugs and are either quickly fixed, or the change that caused the regression is quickly reverted from the Linux kernel tree.

This promise holds true for both the incremental stable kernel updates, as well as the larger “major” updates that happen every three months.

The kernel community can only make this promise for the code that is merged into the Linux kernel tree. Any code that is merged into a device’s kernel that is not in the releases is unknown and interactions with it can never be planned for, or even considered. Devices based on Linux that have large patchsets can have major issues when updating to newer kernels, because of the huge number of changes between each release. SoC patchsets are especially known to have issues with updating to newer kernels due to their large size and heavy modification of architecture specific, and sometimes core, kernel code.

Most SoC vendors do want to get their code merged upstream before their chips are released, but the reality of project-planning cycles and ultimately the business priorities of these companies prevent them from dedicating sufficient resources to the task. This, combined with the historical difficulty of pushing updates to embedded devices, results in almost all of them being stuck on a specific kernel release for the entire lifespan of the device.

Because of the large out-of-tree patchsets, most SoC vendors are starting to standardize on using the LTS releases for their devices. This allows devices to receive bug and security updates directly from the Linux kernel community, without having to rely on the SoC vendor’s backporting efforts, which traditionally are very slow to respond to problems.

It is encouraging to see that the Android project has standardized on the LTS kernels as a “minimum kernel version requirement”. Hopefully that will allow the SoC vendors to continue to update their device kernels in order to provide more secure devices for their users.


When doing kernel releases, the Linux kernel community almost never declares specific changes as “security fixes”. This is due to the basic problem of the difficulty in determining if a bugfix is a security fix or not at the time of creation. Also, many bugfixes are only determined to be security related after much time has passed, so to keep users from getting a false sense of security by not taking patches, the kernel community strongly recommends always taking all bugfixes that are released.

Linus summarized the reasoning behind this behavior in an email to the Linux Kernel mailing list in 2008:

On Wed, 16 Jul 2008, wrote:
> you should check out the last few -stable releases then and see how
> the announcement doesn't ever mention the word 'security' while fixing
> security bugs

Umm. What part of "they are just normal bugs" did you have issues with?

I expressly told you that security bugs should not be marked as such,
because bugs are bugs.

> in other words, it's all the more reason to have the commit say it's
> fixing a security issue.


> > I'm just saying that why mark things, when the marking have no meaning?
> > People who believe in them are just _wrong_.
> what is wrong in particular?

You have two cases:

 - people think the marking is somehow trustworthy.

   People are WRONG, and are misled by the partial markings, thinking that
   unmarked bugfixes are "less important". They aren't.

 - People don't think it matters

   People are right, and the marking is pointless.

In either case it's just stupid to mark them. I don't want to do it,
because I don't want to perpetuate the myth of "security fixes" as a
separate thing from "plain regular bug fixes".

They're all fixes. They're all important. As are new features, for that matter.

> when you know that you're about to commit a patch that fixes a security
> bug, why is it wrong to say so in the commit?

It's pointless and wrong because it makes people think that other bugs
aren't potential security fixes.

What was unclear about that?


This email can be found here, and the whole thread is recommended reading for anyone who is curious about this topic.

When security problems are reported to the kernel community, they are fixed as soon as possible and pushed out publicly to the development tree and the stable releases. As described above, the changes are almost never described as a “security fix”, but rather look like any other bugfix for the kernel. This is done to allow affected parties the ability to update their systems before the reporter of the problem announces it.

Linus describes this method of development in the same email thread:

On Wed, 16 Jul 2008, wrote:
> we went through this and you yourself said that security bugs are *not*
> treated as normal bugs because you do omit relevant information from such
> commits

Actually, we disagree on one fundamental thing. We disagree on
that single word: "relevant".

I do not think it's helpful _or_ relevant to explicitly point out how to
trigger a bug. It's very helpful and relevant when we're trying to chase
the bug down, but once it is fixed, it becomes irrelevant.

You think that explicitly pointing something out as a security issue is
really important, so you think it's always "relevant". And I take mostly
the opposite view. I think pointing it out is actually likely to be

For example, the way I prefer to work is to have people send me and the
kernel list a patch for a fix, and then in the very next email send (in
private) an example exploit of the problem to the security mailing list
(and that one goes to the private security list just because we don't want
all the people at universities rushing in to test it). THAT is how things
should work.

Should I document the exploit in the commit message? Hell no. It's
private for a reason, even if it's real information. It was real
information for the developers to explain why a patch is needed, but once
explained, it shouldn't be spread around unnecessarily.


Full details of how security bugs can be reported to the kernel community in order to get them resolved and fixed as soon as possible can be found in the kernel file Documentation/admin-guide/security-bugs.rst

Because security bugs are not announced to the public by the kernel team, CVE numbers for Linux kernel-related issues are usually released weeks, months, and sometimes years after the fix was merged into the stable and development branches, if at all.

Keeping a secure system

When deploying a device that uses Linux, it is strongly recommended that all LTS kernel updates be taken by the manufacturer and pushed out to their users after proper testing shows the update works well. As was described above, it is not wise to try to pick and choose various patches from the LTS releases because:

Note, this author has audited many SoC kernel trees that attempt to cherry-pick random patches from the upstream LTS releases. In every case, severe security fixes have been ignored and not applied.

As proof of this, I demoed at the Kernel Recipes talk referenced above how trivial it was to crash all of the latest flagship Android phones on the market with a tiny userspace program. The fix for this issue was released 6 months prior in the LTS kernel that the devices were based on; however, none of the devices had upgraded or fixed their kernels for this problem. As of this writing (5 months later) only two devices have fixed their kernel and are now not vulnerable to that specific bug.

February 05, 2018 05:13 PM

February 04, 2018

Pete Zaitcev: Farewell Nexus 7, Hello Huawei M3

Flying a photoshoot of the Carlson, I stuffed my Nexus 7 under my thighs and cracked the screen. In my defense, I did it several times before, because I hate leaving it on the cockpit floor. I had to fly uncoordinated for the photoshoot, which causes anything that's not fixed in place to slide around, and I'm paranoid about controls interference. Anyway, the cracked screen caused a significant dead zone where touch didn't register anymore, and that made the tablet useless. I had to replace it.

In the years since I had the Nexus (apparently since 2014), the industry stopped making good 7-inch tablets. Well, you can still buy $100 tablets in that size. But because the Garmin Pilot was getting spec-hungry recently, I had no choice but to step up. Sad, really. Naturally, I'm having trouble fitting the M3 into pockets where Nexus lived comfortably before. {It's a full-size iPad in the picture, not a Mini.}

The most annoying problem that I encountered was Chrome not liking the SSL certificate of. It bails with ERR_SSL_SERVER_CERT_BAD_FORMAT. I have my own fake CA, so I install my CA certificate on clients and I sign my hosts. I accept the consequences and inconvenience. The annoyance arises because Chrome does not tell what it does not like about the certificate. Firefox works fine with it, as do other applications (like IMAP clients). Chrome on the Nexus worked fine. A cursory web search suggests that Chrome may want alternative names keyed with "DNS.1" instead of "DNS". Dunno what it means and if it is true.

UPDATE: "Top FBI, CIA, and NSA officials all agree: Stay away from Huawei phones"

February 04, 2018 05:17 AM

February 02, 2018

Michael Kerrisk (manpages): man-pages-4.15 is released

I've released man-pages-4.15. The release tarball is available on The browsable online pages can be found on The Git repository for man-pages is available on

This release resulted from patches, bug reports, reviews, and comments from 26 contributors. Just over 200 commits changed around 75 pages. In addition, 3 new manual pages were added.

Among the more significant changes in man-pages-4.15 are the following:

February 02, 2018 03:21 PM

Daniel Vetter: LCA Sydney: Burning Down the Castle

I’ve done a talk about the kernel community. It’s a hot take, but with the feedback I’ve received thus far I think it was on the spot, and started a lot of uncomfortable, but necessary discussion. I don’t think it’s time yet to give up on this project, even if it will take years.

Without further ado, the recording of my talk “Burning Down the Castle” is on YouTube. For those who prefer reading, LWN has you covered with “Too many lords, not enough stewards”. I think Jake Edge and Jon Corbet have done an excellent job in capturing my talk in a balanced fashion.

Further Discussion

For understanding abuse dynamics I can’t recommend “Why Does He Do That?: Inside the Minds of Angry and Controlling Men” by Lundy Bancroft enough. All the examples are derived from a few decades of working with abusers in personal relationships, but the patterns and archetypes that Lundy Bancroft extracts transfers extremely well to any other kind of relationship, whether that’s work, family or open source communities.

There’s endless amounts of stellar talks about building better communities. I’d like to highlight just two: “Life is better with Rust’s community automation” by Emily Dunham and “Have It Your Way: Maximizing Drive-Thru Contribution” by VM Brasseur. For learning more there’s lots of great community topic tracks at various conferences, but also dedicated ones - often as unconferences: Community Leadership Summit, including its various offsprings and maintainerati are two I’ve been at and learned a lot.

Finally there’s the fun of trying to change a huge existing organization with lots of inertia. “Leading Change” by John Kotter has some good insights and frameworks to approach this challenge.

Despite what it might look like I’m not quitting kernel hacking nor the community, and I’m happy to discuss my talk over mail and in upcoming hallway tracks.

February 02, 2018 12:00 AM

January 23, 2018

Pete Zaitcev: 400 gigabits, every second

I have been waiting for years for RJ-45 to fail to keep pace with the gigabits. And it always catches up. But maybe not anymore. Here's what the connector looks like for QSFP-DD, a standard module connector for 400GbE:

Two rows, baby, same as on USB3.

These speeds are mostly used between leaf and spine switches, but I'm sure we'll see them in the upstream routers, too.

January 23, 2018 07:43 PM

January 22, 2018

James Morris: LCA 2018 Kernel Miniconf – SELinux Namespacing Slides

I gave a short talk on SELinux namespacing today at the Kernel Miniconf in Sydney — the slides from the talk are here:

This is a work in progress to which I’ve been contributing, following on from initial discussions at Linux Plumbers 2017.

In brief, there’s a growing need to be able to provide SELinux confinement within containers: typically, SELinux appears disabled within a container on Fedora-based systems, as a workaround for a lack of container support.  Underlying this is a requirement to provide per-namespace SELinux instances,  where each container has its own SELinux policy and private kernel SELinux APIs.

A prototype for SELinux namespacing was developed by Stephen Smalley, who released the code via  There were and still are many TODO items.  I’ve since been working on providing namespacing support to on-disk inode labels, which are represented by security xattrs.  See the v0.2 patch post for more details.

Much of this work will be of interest to other LSMs such as Smack, and many architectural and technical issues remain to be solved.  For those interested in this work, please see the slides, which include a couple of overflow pages detailing some known but as yet unsolved issues (supplied by Stephen Smalley).

I anticipate discussions on this and related topics (LSM stacking, core namespaces) later in the year at Plumbers and the Linux Security Summit(s), at least.

The session was live streamed — I gather a standalone video will be available soon!

ETA: the video is up! See:

January 22, 2018 08:38 AM

January 20, 2018

Pete Zaitcev: NUC versus laptop

When I split off the router, I received a bit of a breather from the Fedora killing i686, because I do not have to upgrade the non-routing server as faithfully as an Internet-facing firewall. Still, eventually I must switch from the ASUS EEEPC to something viable.

So, I considered a NUC, just like the one that Richard W.M. Jones bought. It beats an old laptop in every way. In particular, it's increasingly difficult to disassemble laptops nowadays, and the candidate I have now has its hard drive buried in a particularly vexing way: the whole thing must be taken apart, with a dozen tiny connectors carefully pried off, before the disk can be extracted. Still, a laptop offers a couple of features. #1: it always has a monitor and keyboard, and #2: it comes with its own uninterruptible power supply. And the cost is already amortized.

Long term, I am inclined to believe that Atwood is right and all user-facing computers will morph into tablets. When that happens, the supply of useful laptops will dry up and I will have to resort to whatever microserver box is available. But today is not that day.

January 20, 2018 05:03 PM

January 19, 2018

Greg Kroah-Hartman: Meltdown and Spectre Linux kernel status - update

I keep getting a lot of private emails about my previous post about the latest status of the Linux kernel patches to resolve both the Meltdown and Spectre issues.

These questions all seem to break down into two different categories: “What is the state of the Spectre kernel patches?”, and “Is my machine vulnerable?”

State of the kernel patches

As always, LWN covers the technical details about the latest state of the kernel patches to resolve the Spectre issues, so please go read that to find out that type of information.

And yes, it is behind a paywall for a few more weeks. You should be buying a subscription to get this type of thing!

Is my machine vulnerable?

For this question, it’s now a very simple answer, you can check it yourself.

Just run the following command at a terminal window to determine what the state of your machine is:

$ grep . /sys/devices/system/cpu/vulnerabilities/*

On my laptop, right now, this shows:

$ grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable: Minimal generic ASM retpoline

This shows that my kernel is properly mitigating the Meltdown problem by implementing PTI (Page Table Isolation), and that my system is still vulnerable to Spectre variant 1, and is trying really hard to resolve variant 2 but is not quite there (because I did not build my kernel with a compiler that properly supports the retpoline feature).

If your kernel does not have that sysfs directory or files, then obviously there is a problem and you need to upgrade your kernel!

Some “enterprise” distributions did not backport the changes for this reporting, so if you are running one of those types of kernels, go bug the vendor to fix that, you really want a unified way of knowing the state of your system.

Note that right now, these files are only valid for the x86-64 based kernels, all other processor types will show something like “Not affected”. As everyone knows, that is not true for the Spectre issues, except for very old CPUs, so it’s a huge hint that your kernel really is not up to date yet. Give it a few months for all other processor types to catch up and implement the correct kernel hooks to properly report this.

And yes, I need to go find a working microcode update to fix my laptop’s CPU so that it is not vulnerable against Spectre…

January 19, 2018 10:30 AM

Gustavo F. Padovan: Save the date! Announcing linuxdev-br Conference 2018

We are proud to tell you that the second edition of the linuxdev-br conference will happen on August 25th and 26th, 2018, again at the University of Campinas. The first edition, last November, was a massive success, and the second edition will happen in a bigger place to fit more people, over two days, so it can fit a wider range of talks without preventing the attendees from connecting to each other during the coffee-breaks and happy hours!

Stay tuned for more updates, soon we will publish a call for talks and open the registrations. We want to make linuxdev-br always better! See you there! :)

January 19, 2018 12:44 AM

January 18, 2018

Pavel Machek: Fun with Rust (not spinning this time)

Rust... took me a while to install. I decided I did not like curl | sh, so I created a fresh VM for that. That took a while, and in the end I ran curl | sh anyway. I coded the weather forecast core in Rust... And I feel like every second line needs an explicit typecast. Not nice, but ok; the result will be fast, right? Rust: 6m45 seconds, Python less than 1m7 seconds. Ouch. Ok, Rust really needs optimizations to be anywhere near reasonable run-time speed. 7 seconds optimized. Compile time is... 4 seconds for 450 lines of code. Hmm. Not great... but I guess better than alternatives.

January 18, 2018 06:42 PM

Pavel Machek: Hey Intel, what about an apology?

Hey, Intel. You were selling faulty CPUs for 15+ years, you are still selling faulty CPUs, and there are no signs you even intend to fix them. You sold faulty CPUs for half a year, knowing they were faulty, without telling your customers. You helped develop band-aids for a subset of problems, and a subset of configurations. Yeah, so there's a workaround for Meltdown on 64-bit Linux. Where's the workaround for Meltdown on 32-bit? What about BSDs? MINIX? L4? Where are the workarounds for Spectre? And more importantly -- where are the real fixes? You know, your CPUs fail to do security checks in time. Somehow I think that maybe you should fix your CPUs? I hear you want to achieve “quantum supremacy”. But maybe I'd like to hear how you intend to fix the mess you created, first? I actually started creating a workaround for x86-32, but I somehow feel like I should not be the one fixing this. I'm willing to test the patches...

(And yes, Spectre is an industry-wide problem. Meltdown is -- you screwed it up.)

January 18, 2018 06:38 PM

January 17, 2018

Matthew Garrett: Privacy expectations and the connected home

Traditionally, devices that were tied to logins tended to indicate that in some way - turn on someone's xbox and it'll show you their account name, run Netflix and it'll ask which profile you want to use. The increasing prevalence of smart devices in the home changes that, in ways that may not be immediately obvious to the majority of people. You can configure a Philips Hue with wall-mounted dimmers, meaning that someone unfamiliar with the system may not recognise that it's a smart lighting system at all. Without any actively malicious intent, you end up with a situation where the account holder is able to infer whether someone is home without that person necessarily having any idea that that's possible. A visitor who uses an Amazon Echo is not necessarily going to know that it's tied to somebody's Amazon account, and even if they do they may not know that the log (and recorded audio!) of all interactions is available to the account holder. And someone grabbing an egg out of your fridge is almost certainly not going to think that your smart egg tray will trigger an immediate notification on the account owner's phone that they need to buy new eggs.

Things get even more complicated when there's multiple account support. Google Home supports multiple users on a single device, using voice recognition to determine which queries should be associated with which account. But the account that was used to initially configure the device remains as the fallback, with unrecognised voices ending up being logged to it. If a voice is misidentified, the query may end up being logged to an unexpected account.

There's some interesting questions about consent and expectations of privacy here. If someone sets up a smart device in their home then at some point they'll agree to the manufacturer's privacy policy. But if someone else makes use of the system (by pressing a lightswitch, making a spoken query or, uh, picking up an egg), have they consented? Who has the social obligation to explain to them that the information they're producing may be stored elsewhere and visible to someone else? If I use an Echo in a hotel room, who has access to the Amazon account it's associated with? How do you explain to a teenager that there's a chance that when they asked their Home for contact details for an abortion clinic, it ended up in their parent's activity log? Who's going to be the first person divorced for claiming that they were vegan but having been the only person home when an egg was taken out of the fridge?

To be clear, I'm not arguing against the design choices involved in the implementation of these devices. In many cases it's hard to see how the desired functionality could be implemented without this sort of issue arising. But we're gradually shifting to a place where the data we generate is not only available to corporations who probably don't care about us as individuals, it's also becoming available to people who own the more private spaces we inhabit. We have social norms against bugging our houseguests, but we have no social norms that require us to explain to them that there'll be a record of every light that they turn on or off. This feels like it's going to end badly.

(Thanks to Nikki Everett for conversations that inspired this post)

(Disclaimer: while I work for Google, I am not involved in any of the products or teams described in this post and my opinions are my own rather than those of my employer)


January 17, 2018 09:45 PM

January 15, 2018

Pete Zaitcev: New toy

Guess what.

A Russian pillowcase is much wider (or squar-er) than tubular American ones, so it works perfectly as a cover.

January 15, 2018 09:38 PM

January 12, 2018

Pete Zaitcev: Old news

Per U.S. News:

Alphabet Inc's (GOOG, GOOGL) Google said in 2016 that it was designing a server based on International Business Machines Corp's (IBM) Power9 processor.

Have they put anything into production since then? If not, why bring this up?

UPDATE: R. Hubbell writes by e-mail:

So yes I think the move to the IBM is due to their encounter of the exploits.

A lot of lip service is given to the hazards of the monoculture. But why PPC of all things? Is Google becoming incapable of dealing with any supplier that is not a megacorp?

January 12, 2018 04:07 PM

January 11, 2018

Pete Zaitcev: A split home network

Real quick, why a 4-port router was needed.

  1. Red: Upstream link to ISP
  2. Grey: WiFi
  3. Blue: Entertainment stack
  4. Green: General Ethernet

The only reason to split the blue network is to prevent TiVo from attacking other boxes, such as desktops and printers. Yes, this is clearly not paranoid enough for a guy who insists on a dumb TV.

January 11, 2018 04:53 AM

January 10, 2018

Pete Zaitcev: Buying a dumb TV in 2018 America

I wanted to buy a TV a month ago and found that almost all of them are "Smart" nowadays. When I asked for a conventional TV, people ranging from a floor worker at Best Buy to Nikita Danilov at Facebook implied that I was an idiot. Still, I succeeded.

At first, I started looking at what is positioned as a "conference room monitor". The NEC E506 is far and away the leader, but it's expensive at $800 or so.

Then, I went to Fry's, who advertise quasi-brands like SILO. They had TVs on display, but were out. I was even desperate enough to be upsold to Athyme for $450, but they fortunately were out of that one too.

At that point, I headed to Best Buy, who have an exclusive agreement with Toshiba (h/t Matt Kern on Facebook). I was not happy to support this kind of distasteful arrangement, but very few options remained. There, it was either waiting for delivery, or driving 3 hours to a warehouse store. Considering how much my Jeep burns per mile, I declined.

Finally, I headed to a local Wal-Mart and bought a VIZIO for $400 out the door. No fuss, no problem, easy peasy. Should've done that from the start.

P.S. Some people suggested buying a Smart TV and then not plugging it in. That includes not giving it the password for the house WiFi. Unfortunately, it is still problematic, as some of these TVs will associate with any open wireless network by default. An attacker drives by with a passwordless AP, and roots all the TVs on the block. And I live in a high-tech area where stuff like that happens all the time. When I mentioned it to Nikita, he thought that I was an idiot for sure. It's like the Russian joke about "dropping everything and moving to Uryupinsk."

January 10, 2018 06:29 PM

James Bottomley: GPL as the Best Licence – Governance and Philosophy

In the first part I discussed the balancing mechanisms the GPL provides for enabling corporate contributions, giving users a voice and giving mutually competing corporations a framework for collaborating on equal terms.  In this part I’ll look at how the legal elements of the GPL licences make it pretty much the perfect licence for supporting a community of developers co-operating with corporations and users.

As far as a summary of my talk goes, this series is complete.  However, I’ve been asked to add some elaboration on the legal structure of GPL+DCO contrasted to other CLAs and also include AGPL, so I’ll likely do some other one off posts in the Legal category about this.

Free Software vs Open Source

There have been many definitions of both of these.  Rather than review them, in the spirit of Humpty Dumpty, I’ll give you mine: Free Software, to me, means espousing a set of underlying beliefs about the code (for instance the four freedoms of the FSF).  While this isn’t problematic for many developers (code freedom, of course, is what enables developer driven communities) it is anathema to most corporations and in particular their lawyers because, generally applied, it would require the release of all software based intellectual property.  Open Source on the other hand, to me, means that you follow all the rules of the project (usually licensing and contribution requirements) but don’t necessarily sign up to the philosophy underlying the project (if there is one; most Open Source projects won’t have one).

Open Source projects are compatible with Corporations because, provided they have some commonality in goals, even a corporation seeking to exploit a market can march a long way with a developer driven community before the goals diverge.  This period of marching together can be extremely beneficial for both the project and the corporation and if corporate priorities change, the corporation can simply stop contributing.  As I have stated before, Community Managers serve an essential purpose in keeping this goal alignment by making the necessary internal business adjustments within a corporation and by explaining the alignment externally.

The irony of the above is that collaborating within the framework of the project, as Open Source encourages, could work just as well for a Free Software project, provided the philosophical differences could be overcome (or overlooked).  In fact, one could go a stage further and theorize that the four freedoms as well as being input axioms to Free Software are, in fact, the generated end points of corporate pursuit of Open Source, so if the Open Source model wins in business, there won’t actually be a discernible difference between Open Source and Free Software.

Licences and Philosophy

It has often been said that the licence embodies the philosophy of the project (I’ve said it myself on more than one occasion, for which I’d now like to apologize).  However, it is an extremely reckless statement because it’s manifestly untrue in the case of GPL.  Neither v2 nor v3 does anything to require that adopters also espouse the four freedoms, although it could be said that the Tivoization Clause of v3, to which the kernel developers objected, goes slightly further down the road of trying to embed philosophy in the licence.  The reason for avoiding this statement is that it’s very easy for an inexperienced corporation (or pretty much any corporate legal counsel with lack of Open Source familiarity) to take this statement at face value and assume adopting the code or the licence will force some sort of viral adoption of a philosophy which is incompatible with their current business model and thus reject the use of reciprocal licences altogether.  Whenever any corporation debates using or contributing to Open Source, there’s inevitably an internal debate and this licence embeds philosophy argument is a powerful weapon for the Open Source opponents.

Equity in Contribution Models

Some licensing models, like those pioneered by Apache, are thought to require a foundation to pass the rights through under the licence: developers (or their corporations) sign a Contributor Licence Agreement (CLA) which basically grants the foundation redistributable licences to both copyrights and patents in the code and then the Foundation licenses the contribution to the Project under Apache-2.  The net result is the outbound rights (what consumers of the project get) are Apache-2 but the inbound rights (what contributors are required to give away) are considerably more.  The danger in this model is that control of the foundation gives control of the inbound rights, so who controls the foundation and how control may be transferred forms an important part of the analysis of what happens to contributor rights.  Note that this model is also the basis of open core, with a corporation taking the place of the foundation.

Inequity in the inbound versus the outbound rights creates an imbalance of power within the project between those who possess the inbound rights and everyone else (who only possess the outbound rights) and can damage developer driven communities by creating an alternate power structure (the one which controls the IP rights).  Further, the IP rights tend to be a focus for corporations, so simply joining the controlling entity (or taking a licence from it) instead of actually contributing to the project can become an end goal, thus weakening the technical contributions to the project and breaking the link with end users.

Creating equity in the licensing framework is thus a key to preserving the developer driven nature of a community.  This equity can be preserved by using the Inbound = Outbound principle, first pioneered by Richard Fontana, the essential element being that contributors should only give away exactly the rights that downstream recipients require under the licence.  This structure means there is no need for a formal CLA and instead a model like the Developer Certificate of Origin (DCO) can be used whereby the contributor simply places a statement in the source control of the project itself attesting to giving away exactly the rights required by the licence.  In this model, there’s no requirement to store non-electronic copies of the contribution attestation (which inevitably seem to get lost), because the source control system used by the project does this.  Additionally, the source browsing functions of the source control system can trace a single line of code back exactly to all the contributor attestations thus allowing fully transparent inspection and independent verification of all the inbound contribution grants.
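
As a concrete illustration, this is roughly what the DCO workflow looks like in git (the file name, commit message and author below are made up; the mechanism itself is just the standard Signed-off-by trailer):

$ git commit -s -m "frob: fix counter overflow"   # -s appends a Signed-off-by: line attesting to the DCO
$ git log -1 --format=%B                          # the attestation lives in the project history itself
frob: fix counter overflow

Signed-off-by: Jane Developer <jane@example.org>
$ git blame frob.c                                # traces any line back to the commit, and hence to its sign-off

Because the attestation travels with the commit, anyone with a clone of the repository can independently verify the inbound grants.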

The Dangers of Foundations

Foundations which have no special inbound contribution rights can still present a threat to the project by becoming an alternate power structure.  In the worst case, the alternate power structure is cemented by the Foundation having a direct control link with the project, usually via some Technical Oversight Committee (TOC).  In this case, the natural Developer Driven nature of the project is sapped by the TOC creating uncertainty over whether a contribution should be accepted or not, so now the object isn’t to enthuse fellow developers, it’s to please the TOC.  The control confusion created by this type of foundation directly atrophies the project.

Even if a Foundation specifically doesn’t create any form of control link with the project, there’s still the danger that a corporation’s marketing department sees joining the Foundation as a way of linking itself with the project without having to engage the engineering department, and thus still causing a weakening in both potential contributions and the link between the project and its end users.

There are specific reasons why projects need foundations (anything requiring financial resources like conferences or grants requires some entity to hold the cash) but they should be driven by the need of the community for a service and not by the need of corporations for an entity.

GPL+DCO as the Perfect Licence and Contribution Framework

Reciprocity is the key to this: the requirement to give back the modifications levels the playing field for corporations by ensuring that they each see what the others are doing.  Since there’s little benefit (and often considerable down side) to hiding modifications and doing a dump at release time, it actively encourages collaboration between competitors on shared features.  Reciprocity also contains patent leakage as we saw in Part 1.  Coupled with a DCO using the Inbound = Outbound principle, this means that the Licence and DCO process are everything you need to form an effective and equal community.

Equality enforced by licensing coupled with reciprocity also provides a level playing field for corporate contributors as we saw in part 1, so equality before the community ensures equity among all participants.  Since this is analogous to the equity principles that underlie a lot of the world’s legal systems, it should be no real surprise that it generates the best contribution framework for the project.  Best of all, the model works simply and effectively for a group of contributors without necessity for any more formal body.

Contributions and Commits

Although GPL+DCO can ensure equity in contribution, some human agency is still required to go from contribution to commit.  The application of this agency is one of the most important aspects to the vibrancy of the project and the community.  The agency can be exercised by an individual or a group; however, the composition of the agency doesn’t much matter.  What does matter is that the commit decisions of the agency essentially (and impartially) judge the technical merit of the contribution in relation to the project.

A bad commit agency can be even more atrophying to a community than a Foundation because it directly saps the confidence the community has in the ability of good (or interesting) code to get into the tree.  Conversely, a good agency is simply required to make sound technical decisions about the contribution, which directly preserves the confidence of the community that good code gets into the tree.   As such, the role requires leadership, impartiality and sound judgment rather than any particular structure.

Governance and Enforcement

Governance seems to have many meanings depending on context, so let’s narrow it to the rules by which the project is run (this necessarily includes gathering the IP contribution rights) and how they get followed.  In a GPL+DCO framework, the only additional governance component required is the commit agency.

However, having rules isn’t sufficient unless you also follow them; in other words you need some sort of enforcement mechanism.  In a non-GPL+DCO system, this usually involves having an elaborate set of sanctions and some sort of adjudication system, which, if not set up correctly, can also be a source of inequity and project atrophy.  In a GPL+DCO system, most of the adjudication system and sanctions can be replaced by copyright law (this was the design of the licence, after all), which means licence enforcement (or at least the threat of it) becomes the enforcement mechanism.  The only aspect of governance this doesn’t cover is the commit agency.  However, with no other formal mechanisms to support its authority, the commit agency depends on the trust of the community to function and could easily be replaced by that community simply forking the tree and trusting a new commit agency.

The two essential corollaries of the above are that enforcement does serve an essential governance purpose in a GPL+DCO ecosystem and that the lack of a formal power structure keeps the commit agency honest, because the community could replace it.

The final thing worth noting is that too many formal rules can also seriously weaken a project by encouraging infighting over rule interpretations, how exactly they should be followed and who did or did not dot the i’s and cross the t’s.  This makes the very lack of formality and lack of a formalised power structure which the GPL+DCO encourages a key strength of the model.


In the first part I concluded that the GPL fostered the best ecosystem between developers, corporations and users by virtue of the essential ecosystem fairness it engenders.  In this part I conclude that formal control structures are actually detrimental to a developer driven community and thus the best structural mechanism is pure GPL+DCO with no additional formality.  Finally I conclude that this lack of ecosystem control is no bar to strong governance, since that can be enforced by any contributor through the copyright mechanism, and the very lack of control is what keeps the commit agency correctly serving the community.

January 10, 2018 04:38 PM

January 08, 2018

Pete Zaitcev: Caches are like the government

From an anonymous author, a follow-up to the discussion about the cache etc.:

counterpoint 1: Itanium, which was EPIC like Elbrus, failed even with Intel behind it. And it added prefetching before the end. Source:

counterpoint 2: To get fast, Elbrus has also added at least one kind of prefetch (APB, "Array Prefetch Buffer") and has the multimegabyte cache that Zaitcev decries. Source: [kozhin2016, 10.1109/EnT.2016.027]

counterpoint 3: "According to Keith Diefendorff, in 1978 almost 15 years ahead of Western superscalar processors, Elbrus implemented a two-issue out-of-order processor with register renaming and speculative execution"

1. Itanium, as I recall, suffered too much from the poor initial implementation. Remember that the 1st implementation was designed at Intel, while the 2nd implementation was designed at HP. Intel's chip stunk on ice. By the time HP came along, AMD64 became a thing, and then it was over.

Would Itanium win over the AMD64 if it were better established, burned less power, and were faster, sooner? There's no telling. The compatibility is an important consideration, and the binary translation was very shaky back then, unless you count Crusoe.

2. It's quite true that modern Elbrus runs with a large cache. That is because a cache is obviously beneficial. All this is about is considering, once again, whether better software control of caches, and a better cache architecture in general, would disrupt side-channel signalling and bring performance advantages.

By the way, people might not remember it now, but a large chunk of Opteron's performance derived from its excellent memory controller. It's a component of CPU that tended not to get noticed, but it's essential. Fortunately, the Rowhammer vulnerability drew some much-needed attention to it, as well as a possible role for software control there.

3. Well, Prof. Babayan's own outlook on Elbrus-2 and its superscalar, out-of-order core was, "As you can see, I tried this first, and found that VLIW was better", which is why Elbrus-3 dispensed with all that stuff. Naturally, all that stuff came back when we started to find the limits of EPIC (nee VLIW), just like the cache did.

January 08, 2018 10:51 PM

January 06, 2018

Greg Kroah-Hartman: Meltdown and Spectre Linux kernel status

By now, everyone knows that something “big” just got announced regarding computer security. Heck, when the Daily Mail does a report on it, you know something is bad…

Anyway, I’m not going to go into the details about the problems being reported, other than to point you at the wonderfully written Project Zero paper on the issues involved here. They should just give out the 2018 Pwnie award right now, it’s that amazingly good.

If you do want technical details for how we are resolving those issues in the kernel, see the always awesome writeup for the details.

Also, here’s a good summary of lots of other postings that includes announcements from various vendors.

As for how this was all handled by the companies involved, well this could be described as a textbook example of how NOT to interact with the Linux kernel community properly. The people and companies involved know what happened, and I’m sure it will all come out eventually, but right now we need to focus on fixing the issues involved, and not pointing blame, no matter how much we want to.

What you can do right now

If your Linux systems are running a normal Linux distribution, go update your kernel. They should all have the updates in them already. And then keep updating them over the next few weeks; we are still working out lots of corner-case bugs, given that the testing involved here is complex due to the huge variety of systems and workloads this affects. If your distro does not have kernel updates, then I strongly suggest changing distros right now.

However there are lots of systems out there that are not running “normal” Linux distributions for various reasons (rumor has it that it is way more than the “traditional” corporate distros). They rely on the LTS kernel updates, or the normal stable kernel updates, or they are in-house franken-kernels. For those people here’s the status of what is going on regarding all of this mess in the upstream kernels you can use.

Meltdown – x86

Right now, Linus’s kernel tree contains all of the fixes we currently know about to handle the Meltdown vulnerability for the x86 architecture. Go enable the CONFIG_PAGE_TABLE_ISOLATION kernel build option, and rebuild and reboot and all should be fine.
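
If you want to verify the result, a quick check along these lines should work (the exact /boot/config path is a distro convention, so adjust as needed):

$ dmesg | grep 'page tables isolation'
$ grep CONFIG_PAGE_TABLE_ISOLATION /boot/config-$(uname -r)

A running kernel with the mitigation active reports “Kernel/User page tables isolation: enabled” in the boot log.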

However, Linus’s tree is currently at 4.15-rc6 + some outstanding patches. 4.15-rc7 should be out tomorrow, with those outstanding patches to resolve some issues, but most people do not run a -rc kernel in a “normal” environment.

Because of this, the x86 kernel developers have done a wonderful job in their development of the page table isolation code, so much so that the backport to the latest stable kernel, 4.14, has been almost trivial for me to do. This means that the latest 4.14 release (4.14.12 at this moment in time) is what you should be running. 4.14.13 will be out in a few more days, with some additional fixes in it that are needed for some systems that have boot-time problems with 4.14.12 (it's an obvious problem: if it does not boot, just add the patches now queued up).

I would personally like to thank Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra, Josh Poimboeuf, Juergen Gross, and Linus Torvalds for all of the work they have done in getting these fixes developed and merged upstream in a form that was so easy for me to consume to allow the stable releases to work properly. Without that effort, I don’t even want to think about what would have happened.

For the older long term stable (LTS) kernels, I have leaned heavily on the wonderful work of Hugh Dickins, Dave Hansen, Jiri Kosina and Borislav Petkov to bring the same functionality to the 4.4 and 4.9 stable kernel trees. I also had immense help from Guenter Roeck, Kees Cook, Jamie Iles, and many others in tracking down nasty bugs and missing patches. I want to also call out David Woodhouse, Eduardo Valentin, Laura Abbott, and Rik van Riel for their help with the backporting and integration as well; their help was essential in numerous tricky places.

These LTS kernels also have the CONFIG_PAGE_TABLE_ISOLATION build option that should be enabled to get complete protection.

As this backport is very different from the mainline version that is in 4.14 and 4.15, there are different bugs happening. Right now we know of some VDSO issues that are getting worked on, and some odd virtual machine setups are reporting strange errors, but those are the minority at the moment and should not stop you from upgrading at all right now. If you do run into problems with these releases, please let us know on the stable kernel mailing list.

If you rely on any other kernel tree other than 4.4, 4.9, or 4.14 right now, and you do not have a distribution supporting you, you are out of luck. The lack of patches to resolve the Meltdown problem is so minor compared to the hundreds of other known exploits and bugs that your kernel version currently contains. You need to worry about that more than anything else at this moment, and get your systems up to date first.

Also, go yell at the people who forced you to run an obsoleted and insecure kernel version, they are the ones that need to learn that doing so is a totally reckless act.

Meltdown – ARM64

Right now the ARM64 set of patches for the Meltdown issue are not merged into Linus’s tree. They are staged and ready to be merged into 4.16-rc1 once 4.15 is released in a few weeks. Because these patches are not in a released kernel from Linus yet, I can not backport them into the stable kernel releases (hey, we have rules for a reason…)

Due to them not being in a released kernel, if you rely on ARM64 for your systems (i.e. Android), I point you at the Android Common Kernel tree. All of the ARM64 fixes have been merged into the 3.18, 4.4, and 4.9 branches as of this point in time.
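
For the record, a rough sketch of how to track those branches (the repository URL and branch names here are from memory, so double-check them against the Android kernel documentation):

$ git clone https://android.googlesource.com/kernel/common
$ cd common
$ git checkout android-4.9      # or android-4.4 / android-3.18, matching your base kernel
$ git pull                      # re-run periodically to pick up new fixes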

I would strongly recommend just tracking those branches, as more fixes get added over time due to testing, and things catch up with what gets merged into the upstream kernel releases, especially as I do not know when these patches will land in the stable and LTS kernel releases at this point in time.

For the 4.4 and 4.9 LTS kernels, odds are these patches will never get merged into them, due to the large number of prerequisite patches required. All of those prerequisite patches have been long merged and tested in the android-common kernels, so I think it is a better idea to just rely on those kernel branches instead of the LTS release for ARM systems at this point in time.

Also note, I merge all of the LTS kernel updates into those branches usually within a day or so of being released, so you should be following those branches no matter what, to ensure your ARM systems are up to date and secure.


Spectre

Now things get “interesting”…

Again, if you are running a distro kernel, you might be covered, as some of the distros have merged various patches into them that they claim mitigate most of the problems here. I suggest updating and testing for yourself to see if you are worried about this attack vector.

For upstream, well, the status is that there are no fixes merged into any upstream tree for these types of issues yet. There are numerous patches floating around on the different mailing lists that are proposing solutions for how to resolve them, but they are under heavy development, some of the patch series do not even build or apply to any known trees, the series conflict with each other, and it's a general mess.

This is due to the fact that the Spectre issues were the last to be addressed by the kernel developers. All of us were working on the Meltdown issue, we had no real information on exactly what the Spectre problem was at all, and the patches that were floating around were in even worse shape than what has been publicly posted.

Because of all of this, it is going to take us in the kernel community a few weeks to resolve these issues and get them merged upstream. The fixes are coming in to various subsystems all over the kernel, and will be collected and released in the stable kernel updates as they are merged, so again, you are best off just staying up to date with either your distribution’s kernel releases, or the LTS and stable kernel releases.

It’s not the best news, I know, but it’s reality. If it’s any consolation, it does not seem that any other operating system has full solutions for these issues either, the whole industry is in the same boat right now, and we just need to wait and let the developers solve the problem as quickly as they can.

The proposed solutions are not trivial, but some of them are amazingly good. The Retpoline post from Paul Turner is an example of some of the new concepts being created to help resolve these issues. This is going to be an area of lots of research over the next years to come up with ways to mitigate the potential problems involved in hardware that wants to try to predict the future before it happens.

Other arches

Right now, I have not seen patches for any other architectures than x86 and arm64. There are rumors of patches floating around in some of the enterprise distributions for some of the other processor types, and hopefully they will surface in the weeks to come to get merged properly upstream. I have no idea when that will happen; if you are dependent on a specific architecture, I suggest asking on the arch-specific mailing list about this to get a straight answer.


Again, update your kernels, don’t delay, and don’t stop. The updates to resolve these problems will be continuing to come for a long period of time. Also, there are still lots of other bugs and security issues being resolved in the stable and LTS kernel releases that are totally independent of these types of issues, so keeping up to date is always a good idea.
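
As a practical note, kernels updated shortly after this post grew a sysfs interface for reporting mitigation status, so on a sufficiently recent kernel you can check where you stand with something like:

$ grep . /sys/devices/system/cpu/vulnerabilities/*    # one line per issue: meltdown, spectre_v1, spectre_v2

On kernels that predate this interface the directory simply does not exist, even if some of the fixes are present.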

Right now, there are a lot of very overworked, grumpy, sleepless, and just generally pissed off kernel developers working as hard as they can to resolve these issues that they themselves did not cause at all. Please be considerate of their situation right now. They need all the love and support and free supply of their favorite beverage that we can provide them to ensure that we all end up with fixed systems as soon as possible.

January 06, 2018 12:36 PM

January 05, 2018

Pete Zaitcev: Police action in the drone-to-helicopter collision

The year 2017 was the first year when a civilian multicopter drone collided with a manned aircraft. It was expected for a while and there were several false starts. One thing is curious though - how did they find the operator of the drone? I presume it wasn't something simple like a post on Facebook with a video of the collision. They must've polled witnesses in the area, then looked at surveillance cameras or whatnot, to get it narrowed down to vehicles.

UPDATE: Readers mkevac and veelmore inform that a serialized part of the drone was recovered, and the investigators worked through seller records to identify the buyer.

January 05, 2018 11:06 PM

Pete Zaitcev: Prof. Babayan's Revenge

Someone at GNUsocial posted:

I suspect people trying to find alternate CPU architectures that don't suffer from #Spectre-like bugs have misunderstood how fundamental the problem is. Your CPU will not go fast without caches. Your CPU will not go fast without speculative execution. Solving the problem will require more silicon, not less. I don't think the market will accept the performance hit implied by simpler architectures. OS, compiler and VM (including the browser) workarounds are the way this will get mitigated.

CPUs will not go fast without caches and speculative execution, you say? Prof. Babayan may have something to say about that. Back when I worked under him in the 1990s, he considered caches a primitive workaround.

The work on Narch was informed by the observation that submicron feature sizes provided designers with more silicon than they knew what to do with. So, the task of a CPU designer was to identify ways to use massive amounts of gates productively. But instead, mediocre designers simply added more cache, even multi-level cache.

Talking about it was not enough, so he set out to design and implement his CPU, called "Narch" (later commercialized as "Elbrus-2000"). And he did. The performance was generally on par with its contemporaries, such as Pentium III and UltraSparc. It had a cache, but measured in kilobytes, not megabytes. But there were problems beyond the cache.

The second part of the Bee Yarn Knee's objection deals with speculative execution. Knocking that out required a piece of software known as a binary translator, which did basically the same thing, only in software[*]. Frankly, at this point I cannot guarantee that it wasn't possible to abuse that mechanism for unintentional signalling in the same way Meltdown works. You don't have a cache for timing signals in Narch, but you do have the translator, which can be timed if it runs at run time, as in Transmeta Crusoe. In Narch's case it only ran ahead of time, so it is not exploitable, but the result turned out to be not fast enough for workloads that make good use of speculative execution today (such as LISP and gcc).

Still, I think that a blanket objection that a CPU cannot run fast with no cache and no speculative execution is informed by ignorance of the alternatives. I cannot guarantee that E2k would solve the problem for good; after all, its later models sit on top of a cache. But at least we have a hint.

[*] The translator grew from a language toolchain and could be used in creative ways to translate source. It would not be binary in such case. I omit a lot of detail here.

UPDATE: Oh, boy:

But the speedup from speculative execution IS from parallelism. We're just asking the CPU to find it instead of the compiler. So couldn't you move the smarts into the compiler?

Sean, this is literally what they said 30 years ago.

January 05, 2018 04:56 PM

January 04, 2018

Kees Cook: SMEP emulation in PTI

A nice additional benefit of the recent Kernel Page Table Isolation (CONFIG_PAGE_TABLE_ISOLATION) patches (to defend against CVE-2017-5754, the speculative execution “rogue data cache load” or “Meltdown” flaw) is that the userspace page tables visible while running in kernel mode lack the executable bit. As a result, systems without the SMEP CPU feature (before Ivy Bridge) get it emulated for “free”.

Here’s a non-SMEP system with PTI disabled (booted with “pti=off“), running the EXEC_USERSPACE LKDTM test:

# grep smep /proc/cpuinfo
# dmesg -c | grep isolation
[    0.000000] Kernel/User page tables isolation: disabled on command line.
# cat <(echo EXEC_USERSPACE) > /sys/kernel/debug/provoke-crash/DIRECT
# dmesg
[   17.883754] lkdtm: Performing direct entry EXEC_USERSPACE
[   17.885149] lkdtm: attempting ok execution at ffffffff9f6293a0
[   17.886350] lkdtm: attempting bad execution at 00007f6a2f84d000

No crash! The kernel was happily executing userspace memory.

But with PTI enabled:

# grep smep /proc/cpuinfo
# dmesg -c | grep isolation
[    0.000000] Kernel/User page tables isolation: enabled
# cat <(echo EXEC_USERSPACE) > /sys/kernel/debug/provoke-crash/DIRECT
# dmesg
[   33.657695] lkdtm: Performing direct entry EXEC_USERSPACE
[   33.658800] lkdtm: attempting ok execution at ffffffff926293a0
[   33.660110] lkdtm: attempting bad execution at 00007f7c64546000
[   33.661301] BUG: unable to handle kernel paging request at 00007f7c64546000
[   33.662554] IP: 0x7f7c64546000

It should only take a little more work to leave the userspace page tables entirely unmapped while in kernel mode, and only map them in during copy_to_user()/copy_from_user() as ARM already does with ARM64_SW_TTBR0_PAN (or CONFIG_CPU_SW_DOMAIN_PAN on arm32).

© 2018, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

January 04, 2018 09:43 PM

Pete Zaitcev: More bugs

Speaking of stupid bugs that make no sense to report, Anaconda fails immediately in F27 if one of the disks has an exported volume group on it — in case you thought it was a clever way to protect some data from being overwritten accidentally by the installation. The workaround was to unplug the drives that contained the PVs in the problematic VG. Now, why not to report this? It's 100% reproducible. But reporting presumes a responsibility to re-test, and I'm not going to install a fresh Fedora in a long time again, hopefully, so I'm not in a position to discharge my bug reporter's responsibilities.

January 04, 2018 08:27 PM

January 03, 2018

James Bottomley: GPL as the best licence – Community, Code and Licensing

This article is the first of a set supporting the conclusion that the GPL family of copyleft licences are the best ones for maintaining a healthy development pace while providing a framework for corporations and users to influence the code base.  It is based on an expansion of the thoughts behind the presentation GPL: The Best Business Licence for Corporate Code at the Compliance Summit 2017 in Yokohama.

A Community of Developers

The standard definition of any group of people building some form of open source software is usually that they’re developers (people with the necessary technical skills to create or contribute to the project).  In pretty much every developer driven community, they’re doing it because they get something out of the project itself (this is the scratch your own itch idea in the Cathedral and the Bazaar): usually because they use the project in some form, but sometimes because they’re fascinated by the ideas it embodies (this latter is really how the Linux Kernel got started because ordinarily a kernel on its own isn’t particularly useful but, for a lot of the developers, the ideas that went into creating unix were enormously fascinating and implementations were completely inaccessible in Europe thanks to the USL vs BSDi lawsuit).

The reason for discussing developer driven communities is very simple: they’re the predominant type of community in open source (think Linux Kernel, Gnome, KDE etc) which implies that they’re the natural type of community that forms around shared code collaboration.  In this model of interaction, community and code are interlinked: caring for the code means you also care for the community.  The health of this type of developer community is very easily checked: ask how many contributors would still contribute to the project if they weren’t paid to do it (reduction in patch volume doesn’t matter, just the desire to continue sending patches).  If fewer than 50% of the core contributors would continue contributing if they weren’t paid, then the community is unhealthy.

Developer driven communities suffer from three specific drawbacks:

  1. They’re fractious: people who care about stuff tend to spend a lot of time arguing about it.  Usually some form of self organising leadership fixes a significant part of this, but it’s not guaranteed.
  2. Since the code is built by developers for developers (which is why they care about it) there’s no room for users who aren’t also developers in this model.
  3. The community is informal so there’s no organisation for corporations to have a peer relationship with, plus developers don’t tend to trust corporate motives anyway, making it very difficult for corporations to join the community.

Trusting Corporations and Involving Users

Developer communities often distrust the motives of corporations because they think corporations don’t care about the code in the same way as developers do.  This is actually completely true: developers care about code for its own sake but corporations care about code only as far as it furthers their business interests.  However, this business interest motivation does provide the basis for trust within the community: as long as the developer community can see and understand the business motivation, they can trust the Corporation to do the right thing; within limits, of course, for instance code quality requirements of developers often conflict with time to release requirements for market opportunity.  This shared interest in the code base becomes the framework for limited trust.

Enter the community manager:  A community manager’s job, when executed properly, is twofold: one is to take corporate business plans and realign them so that some of the corporate goals align with those of useful open source communities and the second is to explain this goal alignment to the relevant communities.  This means that a good community manager never touts corporate “community credentials” but instead explains in terms developers can understand the business reasons why the community and the corporation should work together.  Once the goals are visible and aligned, the developer community will usually welcome the addition of paid corporate developers to work on the code.  Paying for contributions is the most effective path for Corporations to exert significant influence on the community and assurance of goal alignment is how the community understands how this influence benefits the community.

Involving users is another benefit corporations can add to the developer ecosystem.  Users who aren’t developers don’t have the technical skills necessary to make their voices and opinions heard within the developer driven community but corporations, which usually have paying users in some form consuming the packaged code, can respond to user input and could act as a proxy between the user base and the developer community.  For some corporations responding to user feedback which enhances uptake of the product is a natural business goal.  For others, it could be a goal the community manager pushes for within the corporation as a useful goal that would help business and which could be aligned with the developer community.  In either case, as long as the motives and goals are clearly understood, the corporation can exert influence in the community directly on behalf of users.

Corporate Fear around Community Code

All corporations have a significant worry about investing in something which they don’t control. However, these worries become definite fears around community code because not only might it take a significant investment to exert the needed influence, there’s also the possibility that the investment might enable a competitor to secure market advantage.

Another big potential fear is loss of intellectual property in the form of patent grants.  Specifically, permissive licences with patent grants allow any other entity to take the code on which the patent reads, incorporate it into a proprietary code base and then claim the benefit of the patent grant under the licence.  This problem, essentially, means that, unless it doesn’t care about IP leakage (or the benefit gained outweighs the problem caused), no corporation should contribute code to which they own patents under a permissive licence with a patent grant.

Both of these fears are significant drivers of “privatisation”, the behaviour whereby a corporation takes community code but does all of its enhancements and modifications in secret and never contributes them back to the community, under the assumption that bearing the forking cost of doing this is less onerous than the problems above.

GPL is the Key to Allaying these Fears

The IP leak fear is easily allayed: whether the version of GPL in question includes an explicit or an implicit patent licence, the IP can only leak as far as the code can go, and the code cannot be included in a proprietary product because of the reciprocal code release requirements; thus the Corporation always has visibility into how far the IP rights might leak by following the licence mandated code releases.

GPL cannot entirely allay the fear of being out-competed with your own code but it can, at least, ensure that if a competitor is using a modification of your code, you know about it (as does your competition), so everyone has a level playing field.  Most customers tend to prefer active participants in open code bases, so to be competitive in the marketplace, corporations using the same code base tend to try to contribute actively.  The reciprocal requirements of GPL provide assurance that no-one can go to market with a secret modification of the code base that they haven’t shared with others.  Therefore, although corporations would prefer dominance and control, they’re prepared to settle for a fully level playing field, which the GPL provides.

Finally, from the community’s point of view, reciprocal licences prevent code privatisation (you can still work from a fork, but you must publish it) and thus encourage code sharing which tends to be a key community requirement.


In this first part, I conclude that the GPL, by ensuring fairness between mutually distrustful contributors and stemming IP leaks, can act as a guarantor of a workable code ecosystem for both developers and corporations and, by using the natural desire of corporations to appeal to customers, can use corporations to bridge the gap between user requirements and the developer community.

In the second part of this series, I’ll look at philosophy and governance and why GPL creates self regulating ecosystems which give corporations and users a useful function while not constraining the natural desire of developers to contribute and contrast this with other possible ecosystem models.

January 03, 2018 11:43 PM

January 02, 2018

Pete Zaitcev: The gdm spamming logs in F27, RHbz#1322588

Speaking of the futility of reporting bugs, check out bug 1322588. Basically, gdm tries to adjust the screen brightness when a user is already logged in on that screen (fortunately, it fails). Fedora users report the bug, the maintainer asks them to report it upstream. They report it upstream. The upstream tinkers with something tangentially related, closes the bug. The maintainer closes the bug in Fedora. The issue is not fixed, users re-open the bug, and the process continues. It has been going on for coming up on 2 years now. I don't know why the GNOME upstream cannot program gdm not to screw with the screen after the very same gdm has logged a user in. It's beyond stupid, and I don't know what can be done. I can buy a Mac, I suppose.


-- Comment #71 from Cédric Bellegarde
Simple workaround:
- Disable auto brightness in your gnome session
- Logout and stop gdm
- Copy ~/.config/dconf/user to /var/lib/gdm/.config/dconf/user

UPDATE 2018-01-10:

-- Comment #75 from Bastien Nocera
Maybe you can do something there, instead of posting passive aggressive blog entries.

Back when Bastien maintained xine, we enjoyed a cordial working relationship, but I guess that does not count for anything.

January 02, 2018 11:36 PM

Pete Zaitcev: No more VLAN in the home network

Thanks to Fedora dropping 32-bit x86 (i686) in F27, I had no choice but to upgrade the home router. I used this opportunity to get rid of VLANs and return to a conventional setup with 4 Ethernet ports. The main reason is, VLANs were not entirely stable in Fedora. Yes, they mostly worked, but I could never be sure that they would continue to work. Also, mostly in this context means, for example, that some time around F24 the boot-up process started hanging on the "Starting the LSB Networking" job for about a minute. It never was worth the trouble raising any bugs or tickets with upstreams; I was never able to resolve a single one of them. Not in Zebra, not in radvd, not in NetworkManager. Besides, if something is broken, I need a solution right now, not when developers turn it around. I suppose VLANs could be all right if I stuck to initscripts, but I needed NetworkManager to interact properly with the upstream ISP at some point. So, whatever. Fedora cost me $150 for the router and killed my VLAN setup.

I looked at ARM routers, but there was nothing. Or, nothing affordable that was SBSA and RHEL compatible. Sorry, ARM, you're still immature. Give me a call when you grow up.

Buying from a Chinese vendor was a mostly typical experience. They try to do good, but... Look at the questions about the console pinout at Amazon. The official answer is, "Hello,the pinouts is 232." Yes, really. When I tried to contact them by e-mail, they sent me a bunch of pictures that included pinouts for Ethernet RJ-45, a pinout for a motherboard header, and a photograph of a Cisco console cable. No, they don't use the Cisco pinout. Instead, they use DB9 pin numbers on the RJ-45 (obviously, pin 9 is not connected). It was easy to figure out using a multimeter, but I thought I'd ask properly first. The result was very stereotypical.

P.S. The bright green light is blink(1), a Christmas present from my daughter. I'm not yet using it to its full potential. The problem is, if it only shows a static light, it cannot indicate if the router hangs or fails to boot. It needs some kind of daemon job that constantly changes it.
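
For what it's worth, a minimal sketch of such a daemon job, assuming the blink1-tool utility from the blink(1) project and assuming its --rgb option behaves as I remember (the router address is of course made up):

#!/bin/sh
# hypothetical watchdog: green while the router answers pings, red otherwise
while true; do
    if ping -c 1 -W 2 192.168.1.1 >/dev/null 2>&1; then
        blink1-tool --rgb=0,255,0 >/dev/null
    else
        blink1-tool --rgb=255,0,0 >/dev/null
    fi
    sleep 10
done

Something this simple would at least turn "the light is green" into a statement about the router being alive rather than about the LED having been set once.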

P.P.S. The SG200 is probably going into the On-Q closet, where it may actually come in useful.

P.P.P.S. There's a PoE injector under the white cable loop somewhere. It powers a standalone Cisco AP, a 1040 model.

January 02, 2018 11:08 PM

January 01, 2018

Paul E. Mc Kenney: 2017 Year-End Advice

One of the occupational hazards of being an old man is the urge to provide unsolicited advice on any number of topics. This time, the topic is weight lifting.

Some years ago, I decided to start lifting weights. My body no longer tolerated running, so I had long since substituted various low-impact mechanical means of aerobic exercise. But there was growing evidence that higher muscle mass is a good thing as one ages, so I figured I should give it a try. This posting lists a couple of my mistakes, which could enable you to avoid them, which in turn could enable you to make brand-spanking new mistakes of your very own design!

The first mistake resulted in sporadic pains in my left palm and wrist, which appeared after many months of upper-body weight workouts. In my experience, at my age, any mention of this sort of thing to medical professionals will result in a tentative diagnosis of arthritis, with the only prescription being continued observation. This experience motivated me to do a bit of self-debugging beforehand, which led me to notice that the pain was only in my left wrist and only in the center of my left palm. This focused my attention on my two middle fingers, especially the one on which I have been wearing a wedding ring pretty much non-stop since late 1985. (Of course, those prone to making a certain impolite hand gesture might have reason to suspect their middle finger.)

So I tried removing my wedding ring. I was unable to do so, even after soaking my hand for some minutes in a bath of water, soap, and ice. This situation seemed like a very bad thing, regardless of what might be causing the pain. I therefore consulted my wife, who suggested a particular jewelry store. Shortly thereafter, I was sitting in a chair while a gentleman used a tiny but effective hand-cranked circular saw to cut through the ring and a couple pairs of pliers to open it up. The gentleman was surprised that it took more than ten turns of the saw to cut through the ring, in contrast to the usual three turns. Apparently wearing a ring for more than 30 years can cause it to work harden.

The next step was for me to go without a ring for a few weeks to allow my finger to decide what size it wanted to be, now that it had a choice. They gave me back the cut-open ring, which I carried in my pocket. Coincidence or not, during that time, the pains in my wrists and palms vanished. Later, the jewelry store resized the ring.

I now remove my ring every night. If you take up any sort of weight lifting involving use of your hands, I recommend that you also remove any rings you might wear, just to verify that you still can.

My second mistake was to embark upon a haphazard weight-lifting regime. I felt that this was OK because I wasn't training for anything other than advanced age, so that any imbalances should be fairly easily addressed.

My body had other ideas, especially in connection with the bout of allergy/asthma/sinusitis/bronchitis/whatever that I have (knock on wood) mostly recovered from. This condition of course results in coughing, in which the muscles surrounding your chest work together to push air out of your lungs as abruptly and quickly as humanly possible. (Interestingly enough, the maximum velocity of cough-driven air seems to be subject to great dispute, perhaps because it is highly variable and because there are so many different places you could measure it.)

The maximum-effort nature of a cough is just fine if your various chest muscles are reasonably evenly matched. Unfortunately, I had not concerned myself with the effects of my weight-lifting regime on my ability to cough, so I learned the hard way that the weaker muscles might object to this treatment, and make their objections known by going into spasms. Spasms involving one's back can be surprisingly difficult to pin down, but for me, otherwise nonsensical shooting pains involving the neck and head are often due to something in my back. I started some simple and gentle back exercises, and also indulged in Warner Brothers therapy, which involves sitting in an easy chair watching Warner Brothers cartoons, assisted by a heating pad lent by my wife.

In summary, if you are starting weight training, (1) take an organized approach and (2) remove any rings you are wearing at least once a week.

Other than that, have a very happy new year!!!

January 01, 2018 02:29 AM

December 29, 2017

Dave Airlie (blogspot): radv and vega conformance test status

We've been passing the Vulkan conformance test suite 1.0.2 mustpass list on radv for quite a while now on the CIK/VI/Polaris cards. However, Vega hadn't achieved the same pass rate.

With a bunch of fixes I pushed this morning and one fix for all GPUs, we now have the same pass rate on all GPUs and 0 fails.

This means Vega on radv can now be submitted for conformance under Vulkan 1.0. Not sure when I'll get time to do the paperwork, maybe early next year sometime.

December 29, 2017 03:24 AM

December 26, 2017

Pavel Machek: PostmarketOS and digital cameras

I did some talking in 2017. If you want to learn about postmarketOS (in Czech), there's a recording at . The ELCE talk about the status of phone cameras is at .

December 26, 2017 10:59 AM

December 23, 2017

Paul E. Mc Kenney: Book review: "Engineering Reminiscences" and "Tears of Stone"

I believe that Charles T. Porter's “Engineering Reminiscences” was a gift from my grandfather, who was himself a machinist. Porter's most prominent contribution was the high-speed steam engine, that is to say, a steam engine operating at more than about 100 RPM. Although steam engines and their governors proved to be somewhat of a dead end, some of his dynamic balancing techniques are still in use.

Technology changes, people and organizations not so much. Chapter XVII, starting on page 189, describes a demonstration of two of his new high-speed steam engines (one operating at 150 RPM, the other at 300 RPM) along with one of his colleague's new boilers at the 1870 Fair of the American Institute in New York. The boiler ran slanted water tubes through the firebox to more efficiently separate steam from the remaining water. The engines were small by 1870s standards, one having 16-inch diameter cylinders with a 30-inch stroke and the other having 6-inch diameter cylinders with a 12-inch stroke.

Other exhibitors also had boilers and steam engines, and yet other exhibitors had equipment driven by steam engines. All the boilers and steam engines were connected, but given that steam engines were, then as now, considered to be way cooler than mere boilers, it should not be too surprising that the boilers could not produce enough steam to keep all the engines running. In fact, by the end of the day, the steam pressure had dropped by half, resulting in great consternation and annoyance all around. The finger of suspicion quickly pointed at Porter's two high-speed steam engines—after all, great speed clearly must imply equally great consumption of steam, right?

Porter had anticipated this situation, and had therefore installed a shutoff valve that isolated the boiler and his two high-speed steam engines from the rest of the Fair's equipment. Porter therefore closed his valve, with the result that the steam pressure within his little steam network immediately rose to 70 PSI and the pressure to the rest of the network dropped to 25 PSI. In fact, the boiler generated excess steam even at 70 PSI, so that the fireman had to leave the firebox door slightly open to artificially lower the boiler temperature.

The steam pressure to the rest of the fair continued to decrease until it was but 15 PSI. Over the noon hour, an additional boiler was installed, which brought the pressure up to 70 PSI. Restarting the steam engines of course reduced the pressure, but at 5PM it was still 25 PSI.

The superintendent of the machinery department had repeatedly asked Porter to reopen the valve, but each time Porter had refused. At 5PM, the superintendent made it clear that his request was now a demand, and that if Porter would not open the valve, the superintendent would open it for him. Porter finally agreed to open the valve, but only on the condition that the other managers of the institute verify that the boiler was in fact generating more than enough steam for both engines. These managers were summoned forthwith, and they agreed that the boiler had been producing most of the show's steam and that the pair of high-speed steam engines had been consuming very little. Porter opened the valve, and there was no further trouble with low-pressure steam.

It is all too easy to imagine a roughly similar story unfolding in today's world. ;–)

Porter went on to develop steam engines capable of running well in excess of 1,000 RPM, with one key challenge being convincing onlookers that the motion-blurred engine really was running that fast.

Interestingly enough, steam engines were Porter's third career. He was a lawyer for several years, but became disgusted with legal practice. At about that same time, he became quite interested in the problem of facing stone, that is, producing a machine that would take a rough-cut stone and give it a smooth planar face (smooth by the standards of the mid-1800s, anyway). After a couple of years of experimentation, he produced a steam-powered machine that efficiently faced stone. Unfortunately, at about that same time, others realized that saws could even more efficiently face stone, so his invention was what we might now call a technical success and a business failure.

Oddly enough, we have recently learned that the application of saws to stone was not an invention of the mid-1800s, but rather a re-invention of a technique used heavily in the ancient Roman Empire, and suspected of having been used as early as the 13th century BC. This is one of many interesting nuggets on life in the Roman Empire brought out by the historical novel “Tears of Stone” by Vannoy and Zeigler. This novel is informed by Zeigler's application of Cold War remote-sensing technology to interesting areas of the Italian landscape, a fact that I had the privilege of learning directly from Zeigler himself.

On the other hand, perhaps Porter's ghost can console himself with the fact that the earliest stone saws were hand-powered, and those of the Roman Empire were water powered. Porter's stone-facing machine was instead powered by modern steam engines. Yes, the ancient Egyptians also made some use of steam power, but as far as we know they never applied it industrially, and never via a reciprocating engine driving a rotary shaft. And yes, all of the qualifiers in the preceding sentence are necessary.

As we learn more about ancient civilizations, it will be interesting to see what other “modern inventions” turn out to have deep roots in ancient times!

December 23, 2017 10:29 PM

Paul E. Mc Kenney: Book review: "Make Trouble"

This book, by John Waters of “Hairspray” fame, was an impulse purchase. After all, who could fail to notice a small pink book with large white textured letters saying “Make Trouble”? It is a transcription of Waters's commencement address to the Rhode Island School of Design's Class of 2015. Those who have known me over several decades might be surprised by this purchase, but what old man could resist a book whose flyleaf states “Anyone embarking on a creative path, he tells us, would do well to realize that pragmatism and discipline are as important as talent and that rejection is nothing to fear.”

They might be even more surprised that I agree with much of his advice. For but three examples:

  1. “A career in the arts is like a hitchhiking trip: All you need is one person to say ‘get in,’ and off you go.” Not really any different from my advising people to use the “high-school boy” algorithm when submitting papers and proposals.
  2. “Keep up with what's causing chaos in your field.” Not really any different from my “Go where there is trouble!”
  3. “Listen to your political enemies, particularly the smart ones”. Me, I would omit the word “political”, but close enough.
The book is mostly pictures, so if you are short of money, you do have the option of just reading it in the bookstore. See, I am making trouble already! ;–)

December 23, 2017 08:46 PM

December 21, 2017

Matthew Garrett: When should behaviour outside a community have consequences inside it?

Free software communities don't exist in a vacuum. They're made up of people who are also members of other communities, people who have other interests and engage in other activities. Sometimes these people engage in behaviour outside the community that may be perceived as negatively impacting communities that they're a part of, but most communities have no guidelines for determining whether behaviour outside the community should have any consequences within the community. This post isn't an attempt to provide those guidelines, but aims to provide some things that community leaders should think about when the issue is raised.

Some things to consider

Did the behaviour violate the law?

This seems like an obvious bar, but it turns out to be a pretty bad one. For a start, many things that are commonly accepted behaviour in various communities may be illegal (eg, reverse engineering work may contravene a strict reading of US copyright law), and taking this to an extreme would result in expelling anyone who's ever broken a speed limit. On the flipside, refusing to act unless someone broke the law is also a bad threshold - much behaviour that communities consider unacceptable may be entirely legal.

There's also the problem of determining whether a law was actually broken. The criminal justice system is (correctly) biased to an extent in favour of the defendant - removing someone's rights in society should require meeting a high burden of proof. However, this is not the threshold that most communities hold themselves to in determining whether to continue permitting an individual to associate with them. An incident that does not result in a finding of criminal guilt (either through an explicit finding or a failure to prosecute the case in the first place) should not be ignored by communities for that reason.

Did the behaviour violate your community norms?

There's plenty of behaviour that may be acceptable within other segments of society but unacceptable within your community (eg, lobbying for the use of proprietary software is considered entirely reasonable in most places, but rather less so at an FSF event). If someone can be trusted to segregate their behaviour appropriately then this may not be a problem, but that's probably not sufficient in all cases. For instance, if someone acts entirely reasonably within your community but engages in lengthy anti-semitic screeds on 4chan, it's legitimate to question whether permitting them to continue being part of your community serves your community's best interests.

Did the behaviour violate the norms of the community in which it occurred?

Of course, the converse is also true - there's behaviour that may be acceptable within your community but unacceptable in another community. It's easy to write off someone acting in a way that contravenes the standards of another community but wouldn't violate your expected behavioural standards - after all, if it wouldn't breach your standards, what grounds do you have for taking action?

But you need to consider that if someone consciously contravenes the behavioural standards of a community they've chosen to participate in, they may be willing to do the same in your community. If pushing boundaries is a frequent trait then it may not be too long until you discover that they're also pushing your boundaries.

Why do you care?

A community's code of conduct can be looked at in two ways - as a list of behaviours that will be punished if they occur, or as a list of behaviours that are unlikely to occur within that community. The former is probably the primary consideration when a community adopts a CoC, but the latter is how many people considering joining a community will think about it.

If your community includes individuals that are known to have engaged in behaviour that would violate your community standards, potential members or contributors may not trust that your CoC will function as adequate protection. A community that contains people known to have engaged in sexual harassment in other settings is unlikely to be seen as hugely welcoming, even if they haven't (as far as you know!) done so within your community. The way your members behave outside your community is going to be seen as saying something about your community, and that needs to be taken into account.

A second (and perhaps less obvious) aspect is that membership of some higher profile communities may be seen as lending general legitimacy to someone, and they may play off that to legitimise behaviour or views that would be seen as abhorrent by the community as a whole. If someone's anti-semitic views (for example) are seen as having more relevance because of their membership of your community, it's reasonable to think about whether keeping them in your community serves the best interests of your community.


I've said things like "considered" or "taken into account" a bunch here, and that's for a good reason - I don't know what the thresholds should be for any of these things, and there doesn't seem to be even a rough consensus in the wider community. We've seen cases in which communities have acted based on behaviour outside their community (eg, Debian removing Jacob Appelbaum after it was revealed that he'd sexually assaulted multiple people), but there's been no real effort to build a meaningful decision making framework around that.

As a result, communities struggle to make consistent decisions. It's unreasonable to expect individual communities to solve these problems on their own, but that doesn't mean we can ignore them. It's time to start coming up with a real set of best practices.

comment count unavailable comments

December 21, 2017 10:09 AM

December 14, 2017

Matthew Garrett: The Intel ME vulnerabilities are a big deal for some people, harmless for most

(Note: all discussion here is based on publicly disclosed information, and I am not speaking on behalf of my employers)

I wrote about the potential impact of the most recent Intel ME vulnerabilities a couple of weeks ago. The details of the vulnerability were released last week, and it's not absolutely the worst case scenario but it's still pretty bad. The short version is that one of the (signed) pieces of early bringup code for the ME reads an unsigned file from flash and parses it. Providing a malformed file could result in a buffer overflow, and a moderately complicated exploit chain could be built that allowed the ME's exploit mitigation features to be bypassed, resulting in arbitrary code execution on the ME.

Getting this file into flash in the first place is the difficult bit. The ME region shouldn't be writable at OS runtime, so the most practical way for an attacker to achieve this is to physically disassemble the machine and directly reprogram it. The AMT management interface may provide a vector for a remote attacker to achieve this - for this to be possible, AMT must be enabled and provisioned and the attacker must have valid credentials[1]. Most systems don't have provisioned AMT, so most users don't have to worry about this.

Overall, for most end users there's little to worry about here. But the story changes for corporate users or high value targets who rely on TPM-backed disk encryption. The way the TPM protects access to the disk encryption key is to insist that a series of "measurements" are correct before giving the OS access to the disk encryption key. The first of these measurements is obtained through the ME hashing the first chunk of the system firmware and passing that to the TPM, with the firmware then hashing each component in turn and storing those in the TPM as well. If someone compromises a later point of the chain then the previous step will generate a different measurement, preventing the TPM from releasing the secret.

However, if the first step in the chain can be compromised, all these guarantees vanish. And since the first step in the chain relies on the ME to be running uncompromised code, this vulnerability allows that to be circumvented. The attacker's malicious code can be used to pass the "good" hash to the TPM even if the rest of the firmware has been tampered with. This allows a sufficiently skilled attacker to extract the disk encryption key and read the contents of the disk[2].
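
To make the chain-of-measurements idea concrete, here is a minimal userspace sketch of the generic "extend" operation that measured boot relies on: each stage's hash is folded into the previously accumulated value, so the final value depends on every step in order. This is an illustration of the concept only (with SHA-256 picked arbitrarily), not the actual ME/TPM interface; it assumes OpenSSL is available (build with gcc pcr_chain.c -lcrypto).

/* pcr_chain.c: illustration of a measured-boot hash chain (concept only) */
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

/* extend: pcr = SHA256(pcr || SHA256(component)) */
static void extend(unsigned char pcr[SHA256_DIGEST_LENGTH],
                   const void *component, size_t len)
{
        unsigned char digest[SHA256_DIGEST_LENGTH];
        unsigned char buf[2 * SHA256_DIGEST_LENGTH];

        SHA256(component, len, digest);
        memcpy(buf, pcr, SHA256_DIGEST_LENGTH);
        memcpy(buf + SHA256_DIGEST_LENGTH, digest, SHA256_DIGEST_LENGTH);
        SHA256(buf, sizeof(buf), pcr);
}

int main(void)
{
        unsigned char pcr[SHA256_DIGEST_LENGTH] = { 0 };  /* starts zeroed */
        const char *stages[] = { "early firmware", "bootloader", "kernel" };

        for (int i = 0; i < 3; i++)
                extend(pcr, stages[i], strlen(stages[i]));

        /* The final value depends on every stage in order; lying about the
         * first measurement (as a compromised ME could) makes the whole
         * chain meaningless even though later stages behave honestly. */
        for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
                printf("%02x", pcr[i]);
        printf("\n");
        return 0;
}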

In addition, TPMs can be used to perform something called "remote attestation". This allows the TPM to provide a signed copy of the recorded measurements to a remote service, allowing that service to make a policy decision around whether or not to grant access to a resource. Enterprises using remote attestation to verify that systems are appropriately patched (eg) before they allow them access to sensitive material can no longer depend on those results being accurate.

Things are even worse for people relying on Intel's Platform Trust Technology (PTT), which is an implementation of a TPM that runs on the ME itself. Since this vulnerability allows full access to the ME, an attacker can obtain all the private key material held in the PTT implementation and, effectively, adopt the machine's cryptographic identity. This allows them to impersonate the system with arbitrary measurements whenever they want to. This basically renders PTT worthless from an enterprise perspective - unless you've maintained physical control of a machine for its entire lifetime, you have no way of knowing whether it's had its private keys extracted and so you have no way of knowing whether the attestation attempt is coming from the machine or from an attacker pretending to be that machine.

Bootguard, the component of the ME that's responsible for measuring the firmware into the TPM, is also responsible for verifying that the firmware has an appropriate cryptographic signature. Since that can be bypassed, an attacker can reflash modified firmware that can do pretty much anything. Yes, that probably means you can use this vulnerability to install Coreboot on a system locked down using Bootguard.

(An aside: The Titan security chips used in Google Cloud Platform sit between the chipset and the flash and verify the flash before permitting anything to start reading from it. If an attacker tampers with the ME firmware, Titan should detect that and prevent the system from booting. However, I'm not involved in the Titan project and don't know exactly how this works, so don't take my word for this)

Intel have published an update that fixes the vulnerability, but it's pretty pointless - there's apparently no rollback protection in the affected 11.x MEs, so while the attacker is modifying your flash to insert the payload they can just downgrade your ME firmware to a vulnerable version. Version 12 will reportedly include optional rollback protection, which is little comfort to anyone who has current hardware. Basically, anyone whose threat model depends on the low-level security of their Intel system is probably going to have to buy new hardware.

This is a big deal for enterprises and any individuals who may be targeted by skilled attackers who have physical access to their hardware, and entirely irrelevant for almost anybody else. If you don't know that you should be worried, you shouldn't be.

[1] Although admins should bear in mind that any system that hasn't been patched against CVE-2017-5689 considers an empty authentication cookie to be a valid credential

[2] TPMs are not intended to be strongly tamper resistant, so an attacker could also just remove the TPM, decap it and (with some effort) extract the key that way. This is somewhat more time consuming than just reflashing the firmware, so the ME vulnerability still amounts to a change in attack practicality.

comment count unavailable comments

December 14, 2017 01:31 AM

December 12, 2017

Matthew Garrett: Eben Moglen is no longer a friend of the free software community

(Note: While the majority of the events described below occurred while I was a member of the board of directors of the Free Software Foundation, I am no longer. This is my personal position and should not be interpreted as the opinion of any other organisation or company I have been affiliated with in any way)

Eben Moglen has done an amazing amount of work for the free software community, serving on the board of the Free Software Foundation and acting as its general counsel for many years, leading the drafting of GPLv3 and giving many forceful speeches on the importance of free software. However, his recent behaviour demonstrates that he is no longer willing to work with other members of the community, and we should reciprocate that.

In early 2016, the FSF board became aware that Eben was briefing clients on an interpretation of the GPL that was incompatible with that held by the FSF. He later released this position publicly with little coordination with the FSF, which was used by Canonical to justify their shipping ZFS in a GPL-violating way. He had provided similar advice to Debian, who were confused about the apparent conflict between the FSF's position and Eben's.

This situation was obviously problematic - Eben is clearly free to provide whatever legal opinion he holds to his clients, but his very public association with the FSF caused many people to assume that these positions were held by the FSF and the FSF were forced into the position of publicly stating that they disagreed with legal positions held by their general counsel. Attempts to mediate this failed, and Eben refused to commit to working with the FSF on avoiding this sort of situation in future[1].

Around the same time, Eben made legal threats towards another project with ties to FSF. These threats were based on a license interpretation that ran contrary to how free software licenses had been interpreted by the community for decades, and was made without any prior discussion with the FSF (2017-12-11 update: page 126 of this document includes the email in which Eben asserts that the Software Freedom Conservancy is engaging in plagiarism by making use of appropriately credited material released under a Creative Commons license). This, in conjunction with his behaviour over the ZFS issue, led to him stepping down as the FSF's general counsel.

Throughout this period, Eben disparaged FSF staff and other free software community members in various semi-public settings. In doing so he harmed the credibility of many people who have devoted significant portions of their lives to aiding the free software community. At Libreplanet earlier this year he made direct threats against an attendee - this was reported as a violation of the conference's anti-harassment policy.

Eben has acted against the best interests of an organisation he publicly represented. He has threatened organisations and individuals who work to further free software. His actions are no longer to the benefit of the free software community and the free software community should cease associating with him.

[1] Contrary to the claim provided here, Bradley was not involved in this process.

(Edit to add: various people have asked for more details of some of the accusations here. Eben is influential in many areas, and publicising details without the direct consent of his victims may put them at professional risk. I'm aware that this reduces my credibility, and it's entirely reasonable for people to choose not to believe me as a result. I will add that I said much of this several months ago, so I'm not making stuff up in response to recent events)

comment count unavailable comments

December 12, 2017 05:59 AM

Linux Plumbers Conference: Linux Plumbers Conference 2018 site and dates

We are pleased to announce that the 2018 edition of the Linux Plumbers Conference will take place in Vancouver, British Columbia, Canada at the Sheraton Vancouver Wall Centre. It will be colocated with the Linux Kernel Summit. LPC will run from November 13, 2018 (Tuesday) to November 15, 2018 (Thursday).

We look forward to another great edition of LPC and to seeing you all in Vancouver!

Stay tuned for more information as the Linux Plumbers Conference committee starts planning for the 2018 conference.

The LPC Planning Committee.

December 12, 2017 12:59 AM

December 05, 2017

Pete Zaitcev: Marcan: Debugging an evil Go runtime bug

Fascinating, and a few reactions spring to mind.

First, I have to admit, the resolution simultaneously blew me away and was very nostalgic. Forgetting that some instructions are not atomic is just the kind of thing I saw people commit in architecture support code in the kernel (I don't remember if I ever had an opportunity to do it myself; it's quite possible, even on sun4c).

Also, my (former) colleague DaveJ (who's now consumed by Facebook -- I remember complaints about useful people "gone to Google and never heard from again", but Facebook is the same hole nowadays) once said, approximately: "Everyone loves to crap on Gentoo hackers for silly optimizations and being otherwise unprofessional, but when it's something interesting it's always (or often) them." Gentoo crew is underrated, including their userbase.

And finally:

Go also happens to have a (rather insane, in my opinion) policy of reinventing its own standard library, so it does not use any of the standard Linux glibc code to call vDSO, but rather rolls its own calls (and syscalls too).

Usually you hear about this when their DNS resolver blows up, but it can be elsewhere, as in this case.

(h/t to a chatter in #animeblogger)

UPDATE: CKS adds that some UNIXen officially require applications to use libc.

December 05, 2017 08:20 PM

November 28, 2017

Matthew Garrett: Potential impact of the Intel ME vulnerability

(Note: this is my personal opinion based on public knowledge around this issue. I have no knowledge of any non-public details of these vulnerabilities, and this should not be interpreted as the position or opinion of my employer)

Intel's Management Engine (ME) is a small coprocessor built into the majority of Intel CPU chipsets[0]. Older versions were based on the ARC architecture[1] running an embedded realtime operating system, but from version 11 onwards they've been small x86 cores running Minix. The precise capabilities of the ME have not been publicly disclosed, but it is at minimum capable of interacting with the network[2], display[3], USB, input devices and system flash. In other words, software running on the ME is capable of doing a lot, without requiring any OS permission in the process.

Back in May, Intel announced a vulnerability in the Advanced Management Technology (AMT) that runs on the ME. AMT offers functionality like providing a remote console to the system (so IT support can connect to your system and interact with it as if they were physically present), remote disk support (so IT support can reinstall your machine over the network) and various other bits of system management. The vulnerability meant that it was possible to log into systems with enabled AMT with an empty authentication token, making it possible to log in without knowing the configured password.

This vulnerability was less serious than it could have been for a couple of reasons - the first is that "consumer"[4] systems don't ship with AMT, and the second is that AMT is almost always disabled (Shodan found only a few thousand systems on the public internet with AMT enabled, out of many millions of laptops). I wrote more about it here at the time.

How does this compare to the newly announced vulnerabilities? Good question. Two of the announced vulnerabilities are in AMT. The previous AMT vulnerability allowed you to bypass authentication, but restricted you to doing what AMT was designed to let you do. While AMT gives an authenticated user a great deal of power, it's also designed with some degree of privacy protection in mind - for instance, when the remote console is enabled, an animated warning border is drawn on the user's screen to alert them.

This vulnerability is different in that it allows an authenticated attacker to execute arbitrary code within the AMT process. This means that the attacker shouldn't have any capabilities that AMT doesn't, but it's unclear where various aspects of the privacy protection are implemented - for instance, if the warning border is implemented in AMT rather than in hardware, an attacker could duplicate that functionality without drawing the warning. If the USB storage emulation for remote booting is implemented as a generic USB passthrough, the attacker could pretend to be an arbitrary USB device and potentially exploit the operating system through bugs in USB device drivers. Unfortunately we don't currently know.

Note that this exploit still requires two things - first, AMT has to be enabled, and second, the attacker has to be able to log into AMT. If the attacker has physical access to your system and you don't have a BIOS password set, they will be able to enable it - however, if AMT isn't enabled and the attacker isn't physically present, you're probably safe. But if AMT is enabled and you haven't patched the previous vulnerability, the attacker will be able to access AMT over the network without a password and then proceed with the exploit. This is bad, so you should probably (1) ensure that you've updated your BIOS and (2) ensure that AMT is disabled unless you have a really good reason to use it.

The AMT vulnerability applies to a wide range of versions, everything from version 6 (which shipped around 2008) and later. The other vulnerability that Intel describe is restricted to version 11 of the ME, which only applies to much more recent systems. This vulnerability allows an attacker to execute arbitrary code on the ME, which means they can do literally anything the ME is able to do. This probably also means that they are able to interfere with any other code running on the ME. While AMT has been the most frequently discussed part of this, various other Intel technologies are tied to ME functionality.

Intel's Platform Trust Technology (PTT) is a software implementation of a Trusted Platform Module (TPM) that runs on the ME. TPMs are intended to protect access to secrets and encryption keys and record the state of the system as it boots, making it possible to determine whether a system has had part of its boot process modified and denying access to the secrets as a result. The most common usage of TPMs is to protect disk encryption keys - Microsoft Bitlocker defaults to storing its encryption key in the TPM, automatically unlocking the drive if the boot process is unmodified. In addition, TPMs support something called Remote Attestation (I wrote about that here), which allows the TPM to provide a signed copy of information about what the system booted to a remote site. This can be used for various purposes, such as not allowing a compute node to join a cloud unless it's booted the correct version of the OS and is running the latest firmware version. Remote Attestation depends on the TPM having a unique cryptographic identity that is tied to the TPM and inaccessible to the OS.

PTT allows manufacturers to simply license some additional code from Intel and run it on the ME rather than having to pay for an additional chip on the system motherboard. This seems great, but if an attacker is able to run code on the ME then they potentially have the ability to tamper with PTT, which means they can obtain access to disk encryption secrets and circumvent Bitlocker. It also means that they can tamper with Remote Attestation, "attesting" that the system booted a set of software that it didn't or copying the keys to another system and allowing that to impersonate the first. This is, uh, bad.

Intel also recently announced Intel Online Connect, a mechanism for providing the functionality of security keys directly in the operating system. Components of this are run on the ME in order to avoid scenarios where a compromised OS could be used to steal the identity secrets - if the ME is compromised, this may make it possible for an attacker to obtain those secrets and duplicate the keys.

It's also not entirely clear how much of Intel's Secure Guard Extensions (SGX) functionality depends on the ME. The ME does appear to be required for SGX Remote Attestation (which allows an application using SGX to prove to a remote site that it's the SGX app rather than something pretending to be it), and again if those secrets can be extracted from a compromised ME it may be possible to compromise some of the security assumptions around SGX. Again, it's not clear how serious this is because it's not publicly documented.

Various other things also run on the ME, including stuff like video DRM (ensuring that high resolution video streams can't be intercepted by the OS). It may be possible to obtain encryption keys from a compromised ME that allow things like Netflix streams to be decoded and dumped. From a user privacy or security perspective, these things seem less serious.

The big problem at the moment is that we have no idea what the actual process of compromise is. Intel state that it requires local access, but don't describe what kind. Local access in this case could simply require the ability to send commands to the ME (possible on any system that has the ME drivers installed), could require direct hardware access to the exposed ME (which would require either kernel access or the ability to install a custom driver) or even the ability to modify system flash (possible only if the attacker has physical access and enough time and skill to take the system apart and modify the flash contents with an SPI programmer). The other thing we don't know is whether it's possible for an attacker to modify the system such that the ME is persistently compromised or whether it needs to be re-compromised every time the ME reboots. Note that even the latter is more serious than you might think - the ME may only be rebooted if the system loses power completely, so even a "temporary" compromise could affect a system for a long period of time.

It's also almost impossible to determine if a system is compromised. If the ME is compromised then it's probably possible for it to roll back any firmware updates but still report that it's been updated, giving admins a false sense of security. The only way to determine for sure would be to dump the system flash and compare it to a known good image. This is impractical to do at scale.

So, overall, given what we know right now it's hard to say how serious this is in terms of real world impact. It's unlikely that this is the kind of vulnerability that would be used to attack individual end users - anyone able to compromise a system like this could just backdoor your browser instead with much less effort, and that already gives them your banking details. The people who have the most to worry about here are potential targets of skilled attackers, which means activists, dissidents and companies with interesting personal or business data. It's hard to make strong recommendations about what to do here without more insight into what the vulnerability actually is, and we may not know that until this presentation next month.

Summary: Worst case here is terrible, but unlikely to be relevant to the vast majority of users.

[0] Earlier versions of the ME were built into the motherboard chipset, but as portions of that were incorporated onto the CPU package the ME followed. Edit: Apparently I was wrong and it's still on the chipset
[1] A descendant of the SuperFX chip used in Super Nintendo cartridges such as Star Fox, because why not
[2] Without any OS involvement for wired ethernet and for wireless networks in the system firmware, but requires OS support for wireless access once the OS drivers have loaded
[3] Assuming you're using integrated Intel graphics
[4] "Consumer" is a bit of a misnomer here - "enterprise" laptops like Thinkpads ship with AMT, but are often bought by consumers.

comment count unavailable comments

November 28, 2017 03:45 AM

November 26, 2017

Michael Kerrisk (manpages): Next Linux/UNIX System Programming course in Munich, 5-9 February, 2018

There are still some places free for my next 5-day Linux/UNIX System Programming course to take place in Munich, Germany, for the week of 5-9 February 2018.

The course is intended for programmers developing system-level, embedded, or network applications for Linux and UNIX systems, or programmers porting such applications from other operating systems (e.g., proprietary embedded/realtime operating systems or Windows) to Linux or UNIX. The course is based on my book, The Linux Programming Interface (TLPI), and covers topics such as low-level file I/O; signals and timers; creating processes and executing programs; POSIX threads programming; interprocess communication (pipes, FIFOs, message queues, semaphores, shared memory), and network programming (sockets).
The course has a lecture+lab format, and devotes substantial time to working on some carefully chosen programming exercises that put the "theory" into practice. Students receive printed and electronic copies of TLPI, along with a 600-page course book that includes all slides presented in the course. A reading knowledge of C is assumed; no previous system programming experience is needed.

Some useful links for anyone interested in the course:

Questions about the course? Email me via

November 26, 2017 07:38 PM

Michael Kerrisk (manpages): man-pages-4.14 is released

I've released man-pages-4.14. The release tarball is available on The browsable online pages can be found on The Git repository for man-pages is available on

This release resulted from patches, bug reports, reviews, and comments from 71 contributors. Nearly 400 commits changed more than 160 pages. In addition, 4 new manual pages were added.

Among the more significant changes in man-pages-4.14 are the following:

November 26, 2017 07:14 PM

Michael Kerrisk (manpages): man-pages-4.13 is released

I've released man-pages-4.13. The release tarball is available on The browsable online pages can be found on The Git repository for man-pages is available on

This release resulted from patches, bug reports, reviews, and comments from around 40 contributors. The release is rather larger than average. (The context diff runs to more than 90k lines.) The release includes more than 350 commits and contains some fairly wide-ranging formatting fix-ups that meant that all 1028 existing manual pages saw some change(s). In addition, 5 new manual pages were added.

Among the more significant changes in man-pages-4.13 are the following:

A special thanks to Eugene Syromyatnikov, who contributed 30 patches to this release!

November 26, 2017 10:17 AM

Konstantin Ryabitsev: Spy stuff: symmetric crypto with forward secrecy

We're all impatiently waiting on quantum-proof asymmetrical crypto algorithms to become commonplace. Until that happens, we must all live with the assumption that all currently used asymmetric crypto will be trivially decryptable once quantum computers become powerful enough. This is probably the reason why government agencies have been wholesale logging all encrypted traffic -- doubtless it's done in hopes that eventually it can be decrypted and analyzed.

The only sure way to remain quantum-proof using today's technology is to use symmetric cryptography with good key sizes, but there are huge downsides when it comes to two-way communication using symmetric crypto:

I was pondering how I would go about devising a way to communicate online today that would address some of the above problems, but would also be:

What I came up with fits all of the above, but with the notable caveat that it is not scalable. It's a scheme that can be used between two secret agents, but not something that can be used by run of the mill internauts. Still, I hope it proves to be a fun read if you're a crypto-nerd. :)

The setup

Yubikey nano

You will need two devices capable of doing standard OATH-HOTP. They are extremely cheap and fairly commonplace -- in my example I will be using two original Yubikey Nanos (not the latest 4th gen stuff, but you can use those just as well). You can get them pretty much anywhere, in almost any country, and they are not generally controlled as crypto devices, even though they have a crypto chip. Certainly, finding a techie with a yubikey is not at all suspicious.

The goal is to configure two of these in an identical way. One would be given to Alice, and another to Bob.

Configuring the yubikeys

There are two "slots" on the yubikey where the pre-shared key can go. In my example, I will be configuring the slots in the following way:

  1. Slot 1: a static password (the -ostatic-ticket configuration below)
  2. Slot 2: an 8-digit OATH-HOTP token (the -ooath-hotp -ooath-hotp8 configuration below)

While we don't absolutely need to use the static password key in slot 1, it's handy for giving our encryption key extra entropy that will be unrecoverable if the yubikey is destroyed, so why not use it?

Here are the commands:

$ unset HISTFILE
$ ykpersonalize -1 -ostatic-ticket -o-append-cr \
    -a<16-byte-hex-key-for-the-static-password>
$ ykpersonalize -2 -ooath-hotp -ooath-hotp8 -o-append-cr \
    -a<20-byte-hex-key-for-HOTP>
# Remove the first key, insert the second, and repeat

What follows the -a are hexadecimal keys that will be used for AES operations inside the yubikey. In the example they are almost the same, but they don't have to be, though whether they are similar or not does not matter much for our needs.

You can generate your random keys on the system where you're doing the provisioning, the 16-byte one for the static key and the 20-byte one for the HOTP key:

xxd -l 16 -p /dev/random
xxd -l 20 -p /dev/random

Do remember that they have to be the same on both yubikeys.

Using the yubikeys to generate symmetric encryption keys

So, now we have two yubikeys configured identically. Alice takes one, and Bob takes the other (and hopefully they don't have a photographic memory good enough to remember the 16-byte and 20-byte keys loaded onto the yubikeys). Both of them also agree on an additional shared passphrase that they will use, let's make it "somethingiknow" in our example.

Now what?

The HOTP protocol's purpose is to generate one-time passwords by performing an HMAC operation on the pre-shared key stored on the device and the keypress event counter. Normally, a user would press a button on the device, which causes it to increment the internal event counter and then perform a simple math operation to calculate the HMAC over the key and counter. The result is then truncated and represented as a 6- or 8-digit one-time code. HOTP is designed in a way such that knowing any number of previously generated one-time codes still makes it impossible to predict the codes generated in the future (or those generated in the past, which is equally important for our use-case).
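
For the curious, here is a minimal sketch of the standard RFC 4226 computation the token performs internally (HMAC-SHA1 over an 8-byte big-endian counter, followed by dynamic truncation). It uses OpenSSL's HMAC() and the RFC's published test key rather than the keys from this post, so the printed codes are only illustrative (build with gcc hotp.c -lcrypto).

/* hotp.c: RFC 4226 one-time code computation */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <openssl/hmac.h>

static uint32_t hotp(const uint8_t *key, size_t key_len,
                     uint64_t counter, unsigned digits)
{
        uint8_t msg[8], digest[EVP_MAX_MD_SIZE];
        unsigned int digest_len = 0;
        uint64_t mod = 1;

        /* the counter is encoded as 8 bytes, most significant byte first */
        for (int i = 7; i >= 0; i--) {
                msg[i] = counter & 0xff;
                counter >>= 8;
        }

        HMAC(EVP_sha1(), key, key_len, msg, sizeof(msg), digest, &digest_len);

        /* dynamic truncation: the low nibble of the last byte picks an offset */
        unsigned off = digest[digest_len - 1] & 0x0f;
        uint32_t bin = ((digest[off] & 0x7f) << 24) |
                       (digest[off + 1] << 16) |
                       (digest[off + 2] << 8) |
                        digest[off + 3];

        while (digits--)
                mod *= 10;
        return bin % mod;
}

int main(void)
{
        const uint8_t key[] = "12345678901234567890";  /* RFC 4226 test key */

        for (uint64_t c = 0; c < 3; c++)
                printf("counter %llu -> %08u\n", (unsigned long long)c,
                       (unsigned)hotp(key, 20, c, 8));
        return 0;
}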

We will be using this to generate one-time encryption passwords that would allow Alice and Bob to generate identical encryption/decryption keys without having to coordinate them with each other. Each encryption key will consist of:

  1. the shared passphrase both agents have memorized ("somethingiknow" in this example)
  2. the static password emitted by slot 1 of the yubikey
  3. several (3 in this example) consecutive 8-digit HOTP codes from slot 2

If you're curious, starting with the HOTP counter at 0 and using the keys in the above examples, this will generate the following string:


We can use more than 3 HOTP codes if we want to increase the key entropy. We don't want to use a single HOTP code because this would make the encryption key easily brute-forceable if Eve finds out about Alice and Bob's scheme and manages to get her hands on the yubikey (e.g. Bob didn't have time to flush it down the toilet).

Using 3 or 4 keypresses should make brute-forcing decryption unfeasible -- it will be cheaper (but still really expensive) to get the keys from the yubikey's crypto chip.

Avoiding key desync

One of the obvious downsides to this method would be if Alice and Bob's yubikeys become desynchronized. Say, Bob fumbles his key and accidentally fires off the HOTP keypress, therefore incrementing the counter by 1. If that happens, identical operations done by Alice and Bob will generate different encryption passwords and the scheme would break down.

This is one of the reasons why the HOTP code is in the 2nd slot, which fires only after 2 seconds of being touched continuously. Even with that precaution, accidental misfires will still undoubtedly happen, which is why the agents need to pass along their counter information to each other with each message, in cleartext (remember, knowing the value of an HOTP code does not give any useful information about the key, nor weaken future generated codes, so this is a safe operation).

The easiest approach would be for each agent to perform 5-6 intentional discards before each message, and then pass the HOTP code immediately preceding the encryption key generation to the other agent in cleartext.

For example, if Alice's counter is at 3 and she needs to communicate with Bob, she does several long keypresses to discard a few HOTP codes, and then uses the last one to pass it along in cleartext in the message to Bob, before using the following 3 HOTP codes as part of the encryption key.

When Bob receives that message, he locates the HOTP code in cleartext and discards his own HOTP codes until the one returned by his key matches the one in cleartext from Alice. He now knows that the next 3 HOTP codes will be the ones used by Alice to generate the encryption key. This scheme allows the agents to synchronize their keys before each decryption -- as long as the HOTP counter desync between the keys remains minor (as would be expected from accidental keypresses).

Example with GnuPG

We assume that both Alice and Bob use an amnesiac system like Tails for their communication. If Alice needs to send Bob a message, she would boot up Tails, write out the message into message.txt and then perform the encryption operation (first discarding a few tokens for the reasons mentioned above, until she gets to 51959508, which she sticks into the comment field of the PGP header):

$ echo "somethingiknowlnrvntetfedgrgtfbhudrljbgrtrunjh629455431454256015151006" \
  | gpg2 -c -a --batch --passphrase-fd 0 --comment '51959508' message.txt

This results in the following output:

Comment: 51959508


She then posts it in some pre-arranged public resource (pastebin, a public forum, etc) or sends it via regular email or chat software that would look inconspicuous and indistinguishable from her regular internet use (Gmail, Whatsapp, Facebook, etc).

Bob (also using Tails, of course), receives the communication and saves it into message.txt.asc. He then starts discarding his HOTP codes until he gets to 51959508, and then generates the identical decryption password using his yubikey and his remembered passphrase:

$ echo "somethingiknowlnrvntetfedgrgtfbhudrljbgrtrunjh629455431454256015151006" \
  | gpg2 -d --batch --passphrase-fd 0 message.txt.asc

Example with OpenStego

It's also possible to use OpenStego to conceal your message using steganography. To generate an encrypted payload and conceal it inside a PNG image, Alice can do the following (I am using the official 150th-anniversary of Canada picture of the Queen, which I converted to png):

$ ./ embed -mf message.txt -cf thequeen.png -sf 51959508.png -e \
  -p "somethingiknowlnrvntetfedgrgtfbhudrljbgrtrunjh629455431454256015151006"

She then uploads the resulting file to some image sharing service that Bob monitors (here it is below).

Her majesty

To get the contents of the message, Bob would then download the original image (just make sure the service doesn't optimize/recompress the image files -- or if it does, that it always preserves the originals). He then uses the name of the file to synchronize his HOTP counter and generates the password needed to get the message back out of the concealed file:

$ ./ extract -sf 51959508.png \
  -p "somethingiknowlnrvntetfedgrgtfbhudrljbgrtrunjh629455431454256015151006"

Try it out yourself with the picture seen above, it's pretty cool. :)

Why this method is awesome

As stated above, this method has the following things going for it:

The downsides

As I said, this is "spy games" kind of stuff -- you have to be a very dedicated and paranoid person to stick to this kind of protocol. It is good for one-to-one or one-to-many communication, but would quickly break down if attempted for group messaging due to inevitable HOTP counter desync. That said, it can probably be used to establish time-limited group chat sessions that are rekeyed daily.

If anything, this is just a mental paranoia exercise -- it was fun to do, but I certainly hope never to be in a situation where I have to worry about needing to use anything like it.

Thanks for reading, and please feel free to email me at if you have any comments.

November 26, 2017 04:00 AM

November 25, 2017

Linux Plumbers Conference: Audio Recordings Posted

This year, by way of an experiment, we tried recording the audio through the sound system of the talks track and one Microconference track (those in Platinum C).  Unfortunately, because of technical problems, we have no recordings from Wednesday, but mostly complete ones from Thursday and Friday (missing only TPM Software Stack Status and Managing the Impact of Growing CPU Register State).

To find the audio, go to the full description of the talk or Microconference (click on the title) and scroll down to the bottom of the Abstract (just before the Tags section).  The audio is downloadable mp3, so you can either stream directly to your browser or download for later offline listening.

If you find the audio useful (or not), please let us know ( so we can plan for doing it again next year.

November 25, 2017 03:30 PM

November 24, 2017

Paul E. Mc Kenney: Parallel Programming: November 2017 Update

This USA Thanksgiving holiday weekend features a new release of Is Parallel Programming Hard, And, If So, What Can You Do About It?.

This update includes more formatting and build-system improvements, bibliography updates, and better handling of listings, all courtesy of Akira Yokosawa; numerous fixes and updates from Junchang Wang, Pierre Kuo, SeongJae Park, and Yubin Ruan; a new futures section on quantum computing; updates to the formal-verification section based on recent collaborations; and a full rewrite of the memory-barriers section, which is now its own chapter. This rewrite was of course based on recent work with my partners in memory-ordering crime, Jade Alglave, Luc Maranget, Andrea Parri, and Alan Stern.

As always, git:// will be updated in real time.

November 24, 2017 11:21 PM

November 20, 2017

Davidlohr Bueso: Linux v4.14: Performance Goodies

Last week Linus released the v4.14 kernel with some noticeable performance changes. The following is an unsorted and incomplete list of changes that went in. Note that the term 'performance' can be vague in that some gains in one area can negatively affect another, so take everything with a grain of salt and reach your own conclusions.

sysvipc: scale key management

We began using relativistic hash tables for managing ipc keys, which greatly improves on the previous O(N) lookups. As such, ipc_findkey() calls are significantly faster (+800% in some reaim file benchmarks) and we need not iterate all elements each time. Improvements are seen even in scenarios where the number of keys is but a handful, so this is pretty much a win from any standpoint.
[Commit 0cfb6aee70bd]

interval-tree: fast overlap detection

With the new extended rbtree API that caches the smallest (leftmost) node, the pointer is always available instead of requiring an O(log N) walk to the end of the tree. This allows extending and completing the fast overlap detection for interval trees, speeding up (sub)tree searches when the interval is completely to the left or right of the current tree's max interval. In addition, a number of other users that traverse rbtrees were updated to use the new rbtree_cached, such as epoll, procfs and cfq.
[Commit cd9e61ed1eeb, 410bd5ecb276, 2554db916586, b2ac2ea6296a, f808c13fd373]
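
As a rough illustration of the new interface (a kernel-style sketch, not code from any real subsystem, so it only builds in kernel context), an insert path with rb_root_cached looks roughly like this; the rbtree code maintains the leftmost hint, and the minimum becomes an O(1) lookup via rb_first_cached():

#include <linux/rbtree.h>

struct item {
        struct rb_node node;
        u64 key;
};

static struct rb_root_cached tree = RB_ROOT_CACHED;

static void item_insert(struct item *new)
{
        struct rb_node **link = &tree.rb_root.rb_node, *parent = NULL;
        bool leftmost = true;

        while (*link) {
                struct item *cur = rb_entry(*link, struct item, node);

                parent = *link;
                if (new->key < cur->key) {
                        link = &parent->rb_left;
                } else {
                        link = &parent->rb_right;
                        leftmost = false;   /* not the new minimum */
                }
        }

        rb_link_node(&new->node, parent, link);
        /* the last argument tells the rbtree code whether to update the hint */
        rb_insert_color_cached(&new->node, &tree, leftmost);
}

/* O(1) access to the smallest key, no tree walk required */
static struct item *item_min(void)
{
        struct rb_node *n = rb_first_cached(&tree);

        return n ? rb_entry(n, struct item, node) : NULL;
}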

sched: waitqueue bookmarks

A situation where constant NUMA migrations of a hot page triggered a large number of page waiters being awoken exposed some issues in the waitqueue implementation. In such cases, a large number of wakeups occur while holding a spinlock, which causes significant unbounded latencies. Unlike wake_qs (used in futexes and locks), where batched wakeups are done without the lock, waitqueue bookmarks allow the waker to pause and stop iterating the wake list so that another process has a chance to acquire the lock, then resume where it left off.
[Commit 3510ca20ece, 2554db916586, 11a19c7b099f]

 x86 PCID (Process Context Identifier)

This is a 64-bit hardware feature that allows tagging TLB entries so that, upon a context switch, only the required entries are flushed. Virtualization (VT-x) has supported a similar feature for a while, via VPID, and on other archs it is called an address space ID. Linux's support is somewhat special: in order to avoid the x86 limitation of 4096 IDs (or processes), the implementation actually uses a PCID to identify a recently-used mm (process address space) on a per-cpu basis. An mm has no fixed PCID binding at all; instead, it is given a fresh PCID each time it's loaded, except in cases where we want to preserve the TLB, in which case we reuse a recent value. To illustrate, in a workload under kvm that ping-pongs two processes, dTLB misses were reduced by ~17x.
[Commit f39681ed0f48, b0579ade7cd8, 94b1b03b519b, 43858b4f25cf, cba4671af755, 0790c9aad849, 660da7c9228f, 10af6235e0d3]


ORC (Oops Rewind Capability) Unwinder

The much acclaimed replacement for frame pointers and the (out of tree) DWARF unwinder. Through simplicity, the end result is faster profiling, such as for perf. Experiments show a 20x performance increase using ORC vs DWARF while calling save_stack_trace 20,000 times via a single vfs_write. Compared to frame pointers, the ORC unwinder is more accurate across interrupt entry frames and enables a 5-10% performance improvement across the entire kernel.
[Commit ee9f8fce9964, 39358a033b2e]

mm: choose swap device according to numa node

If the system has more than one swap device and the swap devices have node information, we can make use of this to decide which swap device to use in get_swap_pages() and get better performance. This change replaces the single global swap_avail list with a per-numa-node list: each numa node sees its own priority-based list of available swap devices. A swap device's priority can be promoted on its matching node's swap_avail_list. This shows ~25% improvements on a 2-node box when benchmarking random writes on an mmaped region with SSDs attached to each node, ensuring swapping in and out.
[Commit a2468cc9bfdf]

mm: reduce cost of page allocator

Upon page allocation, the per-zone statistics are updated, introducing overhead in the form of cacheline bouncing, responsible for ~30% of all CPU cycles when allocating a single page. The networking folks have been known to complain about the performance degradation when dealing with the memory management subsystem, particularly the page allocator. The fact that these NUMA-associated counters are rarely used allows the counter threshold that determines the frequency of updating the global counter with the percpu counters (hence the cacheline bouncing) to be increased. This means hurting readers, but that's the point.
[Commit 3a321d2a3dde, 1d90ca897cb0, 638032224ed7]

archs: multibyte memset

New calls memset16(), memset32() and memset64() are introduced, which are like memset(), but allow the caller to fill the destination with a value larger than a single byte. There are a number of places in the kernel that can benefit from using an optimized function rather than a loop; sometimes for text size, sometimes for speed, and sometimes both. When supported by the architecture, a single instruction is used, such as stosq (store a quadword) on x86-64. Zram shows a 7% performance improvement on x86 with 100MB of non-zero deduplicatable data. If not available, the implementation falls back to the slower loop.
[Commits  3b3c4babd898, 03270c13c5ff, 4c51248533ad, 48ad1abef402]
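
The kernel prototypes are of the form void *memset32(uint32_t *s, uint32_t v, size_t count), where count is a number of 32-bit words rather than bytes. Here is a tiny userspace model of those semantics (the kernel's generic fallback is essentially this loop, with architectures free to override it with wide store instructions):

/* memset32_model.c: userspace model of the memset32() semantics */
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

static void *memset32_model(uint32_t *s, uint32_t v, size_t count)
{
        uint32_t *p = s;

        while (count--)
                *p++ = v;       /* generic fallback: a plain word-wide loop */
        return s;
}

int main(void)
{
        uint32_t buf[16];

        /* e.g. filling a buffer with a recognizable 32-bit pattern */
        memset32_model(buf, 0xdeadbeef, 16);
        printf("%08x ... %08x\n", (unsigned)buf[0], (unsigned)buf[15]);
        return 0;
}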

powerpc: improve TLB flushing

A few optimisations were also added to the radix MMU TLB flushing, mostly to avoid unnecessary Page Walk Cache (PWC) flushes when the structure of the tree is not changing.
[Commit a46cc7a90fd8, 424de9c6e3f8]

There are plenty of other performance optimizations out there, including ext4 parallel file creation and quotas, additional memset improvements in sparc, transparent hugepage migration and swap improvements, ipv6 (ip6_route_output()) optimizations, etc. Again, the list here is partial and biased by me. For a more complete list of features, play with 'git log' or visit lwn (part1, part2) and kernelnewbies.

November 20, 2017 03:50 PM

November 15, 2017

Kees Cook: security things in Linux v4.14

Previously: v4.13.

Linux kernel v4.14 was released this last Sunday, and there’s a bunch of security things I think are interesting:

vmapped kernel stack on arm64
Similar to the same feature on x86, Mark Rutland and Ard Biesheuvel implemented CONFIG_VMAP_STACK for arm64, which moves the kernel stack to an isolated and guard-paged vmap area. With traditional stacks, there were two major risks when exhausting the stack: overwriting the thread_info structure (which contained the addr_limit field which is checked during copy_to/from_user()), and overwriting neighboring stacks (or other things allocated next to the stack). While arm64 previously moved its thread_info off the stack to deal with the former issue, this vmap change adds the last bit of protection by nature of the vmap guard pages. If the kernel tries to write past the end of the stack, it will hit the guard page and fault. (Testing for this is now possible via LKDTM’s STACK_GUARD_PAGE_LEADING/TRAILING tests.)

One aspect of the guard page protection that will need further attention (on all architectures) is that if the stack grew because of a giant Variable Length Array on the stack (effectively an implicit alloca() call), it might be possible to jump over the guard page entirely (as seen in the userspace Stack Clash attacks). Thankfully the use of VLAs is rare in the kernel. In the future, hopefully we’ll see the addition of PaX/grsecurity’s STACKLEAK plugin which, in addition to its primary purpose of clearing the kernel stack on return to userspace, makes sure stack expansion cannot skip over guard pages. This “stack probing” ability will likely also become directly available from the compiler as well.
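
To make the VLA concern concrete, here is a tiny userspace analogue (not kernel code): a variable-length array moves the stack pointer by an attacker-influenced amount in a single step, so the first write into it can land beyond a guard page rather than faulting on it.

/* vla_demo.c: how a VLA can skip past a stack guard page */
#include <stdio.h>
#include <stdlib.h>

static void handle_request(size_t len)
{
        char buf[len];  /* VLA: the stack pointer drops by len bytes at once */

        buf[0] = 'A';   /* first write lands ~len bytes below the caller's
                           frame; with a huge len it may be past the guard page */
        printf("buffer at %p, %zu bytes\n", (void *)buf, len);
}

int main(int argc, char **argv)
{
        /* an "attacker-controlled" size: try 64, then something enormous */
        handle_request(argc > 1 ? strtoul(argv[1], NULL, 0) : 64);
        return 0;
}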

set_fs() balance checking
Related to the addr_limit field mentioned above, another class of bug is finding a way to force the kernel into accidentally leaving addr_limit open to kernel memory through an unbalanced call to set_fs(). In some areas of the kernel, in order to reuse userspace routines (usually VFS or compat related), code will do something like: set_fs(KERNEL_DS); ...some code here...; set_fs(USER_DS);. When the USER_DS call goes missing (usually due to a buggy error path or exception), subsequent system calls can suddenly start writing into kernel memory via copy_to_user (where the “to user” really means “within the addr_limit range”).

Thomas Garnier implemented USER_DS checking at syscall exit time for x86, arm, and arm64. This means that a broken set_fs() setting will not extend beyond the buggy syscall that fails to set it back to USER_DS. Additionally, as part of the discussion on the best way to deal with this feature, Christoph Hellwig and Al Viro (and others) have been making extensive changes to avoid the need for set_fs() being used at all, which should greatly reduce the number of places where it might be possible to introduce such a bug in the future.

SLUB freelist hardening
A common class of heap attacks is overwriting the freelist pointers stored inline in the unallocated SLUB cache objects. PaX/grsecurity developed an inexpensive defense that XORs the freelist pointer with a global random value (and the storage address). Daniel Micay improved on this by using a per-cache random value, and I refactored the code a bit more. The resulting feature, enabled with CONFIG_SLAB_FREELIST_HARDENED, makes freelist pointer overwrites very hard to exploit unless an attacker has found a way to expose both the random value and the pointer location. This should render blind heap overflow bugs much more difficult to exploit.
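
A userspace sketch of the obfuscation idea (not the kernel's actual code): the pointer stored inside a free object is XORed with a per-cache secret and with the address it is stored at, so a blind overwrite can no longer aim the freelist at a chosen target.

/* freelist_harden_demo.c: model of the XORed freelist pointer idea */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

static uintptr_t cache_random;  /* per-cache secret, chosen at cache creation */

/* obfuscate or deobfuscate a freelist pointer; the same XOR does both */
static void *freelist_ptr(void *ptr, uintptr_t ptr_addr)
{
        return (void *)((uintptr_t)ptr ^ cache_random ^ ptr_addr);
}

int main(void)
{
        void *objects[2][8];    /* two stand-ins for free slab objects */
        /* the freelist pointer lives inside the free object itself */
        void **slot = &objects[0][0];

        srand(1);               /* stand-in for the kernel's random seed */
        cache_random = ((uintptr_t)rand() << 16) ^ (uintptr_t)rand();

        /* store: the "next free object" pointer, masked with secret + address */
        *slot = freelist_ptr(&objects[1][0], (uintptr_t)slot);
        printf("stored   : %p\n", *slot);

        /* load: the allocator undoes the XOR when popping the object */
        printf("recovered: %p\n", freelist_ptr(*slot, (uintptr_t)slot));
        return 0;
}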

Additionally, Alexander Popov implemented a simple double-free defense, similar to the “fasttop” check in the GNU C library, which will catch sequential free()s of the same pointer. (And has already uncovered a bug.)
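Building on the helper sketched above, the double-free check is conceptually just one comparison when the freelist pointer is written back; again a simplified sketch rather than the exact code:

/* Simplified sketch: the object being freed must never already be the head of
 * the freelist it is about to point at; if it is, this is a sequential
 * double-free. */
static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
{
	unsigned long freeptr_addr = (unsigned long)object + s->offset;

	BUG_ON(object == fp);	/* naive double-free detection */
	*(void **)freeptr_addr = freelist_ptr(s, fp, freeptr_addr);
}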

Future work would be to provide similar metadata protections to the SLAB allocator (though SLAB doesn’t store its freelist within the individual unused objects, so it has a different set of exposures compared to SLUB).

setuid-exec stack limitation
Continuing the various additional defenses to protect against future problems related to userspace memory layout manipulation (as shown most recently in the Stack Clash attacks), I implemented an 8MiB stack limit for privileged (i.e. setuid) execs, inspired by a similar protection in grsecurity, after reworking the secureexec handling by LSMs. This complements the unconditional limit to the size of exec arguments that landed in v4.13.
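The policy itself is simple; a hedged sketch of the clamp (not the exact fs/exec.c code, and the function name is made up):

#include <stdbool.h>

/* Hedged sketch: when an exec gains privilege (secureexec, e.g. setuid), cap
 * the effective stack rlimit at 8MiB so userspace cannot pre-inflate the
 * stack to squeeze the privileged binary's memory layout. */
#define PRIV_STACK_LIMIT	(8UL * 1024 * 1024)

static unsigned long clamped_stack_rlimit(unsigned long rlim_cur, bool secureexec)
{
	if (secureexec && rlim_cur > PRIV_STACK_LIMIT)
		return PRIV_STACK_LIMIT;
	return rlim_cur;
}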

randstruct automatic struct selection
While the bulk of the port of the randstruct gcc plugin from grsecurity landed in v4.13, the last of the work needed to enable automatic struct selection landed in v4.14. This means that the coverage of randomized structures, via CONFIG_GCC_PLUGIN_RANDSTRUCT, now includes one of the major targets of exploits: function pointer structures. Without knowing the build-randomized location of a callback pointer an attacker needs to overwrite in a structure, exploits become much less reliable.
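As an illustration (the struct and its members are hypothetical), an all-function-pointer ops table like the following is exactly what automatic selection now picks up; with the plugin enabled, its field order differs from build to build, so a hard-coded offset to, say, the write callback stops being useful:

/* Hypothetical ops table: structs made up entirely of function pointers are
 * now randomized automatically, and __randomize_layout can mark others
 * explicitly.  The member order an exploit would hard-code is shuffled at
 * build time. */
struct widget_ops {
	int  (*open)(void *priv);
	long (*read)(void *priv, char *buf, unsigned long len);
	long (*write)(void *priv, const char *buf, unsigned long len);
	void (*release)(void *priv);
} __randomize_layout;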

structleak passed-by-reference variable initialization
Ard Biesheuvel enhanced the structleak gcc plugin to initialize all variables on the stack that are passed by reference when built with CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL. Normally the compiler will yell if a variable is used before being initialized, but it silences this warning if the variable’s address is passed into a function call first, as it has no way to tell if the function did actually initialize the contents. So the plugin now zero-initializes such variables (if they hadn’t already been initialized) before the function call that takes their address. Enabling this feature has a small performance impact, but solves many stack content exposure flaws. (In fact at least one such flaw reported during the v4.15 development cycle was mitigated by this plugin.)
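A hypothetical example of the bug class this addresses; the struct and helper are made up:

#include <linux/types.h>
#include <linux/uaccess.h>

static void fill_info(struct dev_info *info);	/* hypothetical helper; may not write every field */

/* Hypothetical: 'info' is passed by reference, so the compiler stops warning
 * about it being uninitialized.  If fill_info() skips a field, stale stack
 * bytes would reach userspace.  With CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL
 * the plugin zero-initializes 'info' before its address is taken. */
struct dev_info {
	u32 flags;
	u32 reserved;		/* never written by fill_info(): would leak stack data */
	u64 counters[4];
};

static long dev_get_info(void __user *arg)
{
	struct dev_info info;	/* plugin inserts the equivalent of memset(&info, 0, sizeof(info)) */

	fill_info(&info);
	if (copy_to_user(arg, &info, sizeof(info)))
		return -EFAULT;
	return 0;
}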

improved boot entropy
Laura Abbott and Daniel Micay improved early boot entropy available to the stack protector by both moving the stack protector setup later in the boot, and including the kernel command line in boot entropy collection (since with some devices it changes on each boot).

eBPF JIT for 32-bit ARM
The ARM BPF JIT had been around for a while, but it didn’t support eBPF (and, as a result, did not provide constant value blinding, which meant it was exposed to being used by an attacker to build arbitrary machine code with BPF constant values). Shubham Bansal spent a bunch of time building a full eBPF JIT for 32-bit ARM, which both speeds up eBPF and brings it up to date on JIT exploit defenses in the kernel.

seccomp improvements
Tyler Hicks addressed a long-standing deficiency in how seccomp could log action results. In addition to creating a way to mark a specific seccomp filter as needing to be logged with SECCOMP_FILTER_FLAG_LOG, he added a new action result, SECCOMP_RET_LOG. With these changes in place, it should be much easier for developers to inspect the results of seccomp filters, and for process launchers to generate logs for their child processes operating under a seccomp filter.

Additionally, I finally found a way to implement an often-requested feature for seccomp, which was to kill an entire process instead of just the offending thread. This was done by creating the SECCOMP_RET_ACTION_FULL mask (née SECCOMP_RET_ACTION) and implementing SECCOMP_RET_KILL_PROCESS.
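To show both pieces together, here is a hedged userspace sketch (x86-64 assumed, error handling and the usual architecture check omitted): one syscall is audit-logged via SECCOMP_RET_LOG, another kills the whole process via SECCOMP_RET_KILL_PROCESS, and the filter is installed with SECCOMP_FILTER_FLAG_LOG.

#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hedged sketch: log personality(), kill the whole process on ptrace(),
 * allow everything else. */
int install_filter(void)
{
	struct sock_filter filter[] = {
		/* Load the syscall number. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* Audit-log (but still allow) any personality() call. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_personality, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_LOG),
		/* Kill the entire process (not just this thread) on ptrace(). */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_ptrace, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
		/* Everything else is allowed. */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(filter) / sizeof(filter[0]),
		.filter = filter,
	};

	prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
	/* SECCOMP_FILTER_FLAG_LOG marks this filter's non-allow actions as loggable. */
	return syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
		       SECCOMP_FILTER_FLAG_LOG, &prog);
}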

That’s it for now; please let me know if I missed anything. The v4.15 merge window is now open!

© 2017 – 2018, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

November 15, 2017 05:23 AM

November 14, 2017

James Morris: Save the Dates: Linux Security Summit Events for 2018

There will be a new European version of the Linux Security Summit for 2018, in addition to the established North American event.

The dates and locations are as follows:

Stay tuned for CFP announcements!


November 14, 2017 11:24 PM

November 13, 2017

Gustavo F. Padovan: The linuxdev-br conference was a success!

Last Saturday we had the first edition of the Linux Developer Conference Brazil, a conference born from the need for a meeting point in Brazil for the developers, enthusiasts, and companies behind the FOSS projects that form the core of modern Linux systems, whether in smartphones, the cloud, cars, or TVs.

After a few years traveling to conferences around the world, I felt that we didn’t have any forum like them in Brazil, so I came up with the idea of building one myself and invited two friends, Bruno Dilly and João Moreira, to take on the challenge. We also got help from the University of Campinas, which allowed us to use its space; many thanks to Professor Islene Garcia.

Together we made linuxdev-br a success, and the talks were great. Almost 100 people attended the conference, some of them traveling from quite distant parts of Brazil. During the day we had João Avelino Bellomo Filho talking about SystemTap, Lucas Villa Real talking about virtualization with GoboLinux’s Runner, and Felipe Neves talking about the Zephyr project. In the afternoon we had Fabio Estevam talking about Device Tree, Arnaldo Melo on perf tools, and João Moreira on Live Patching. All videos are available here (in Portuguese).

To finish the day we had a happy hour paid for by the conference sponsors. It was a great opportunity to have some beers and interact with other attendees.

I want to thank everyone who joined us for the first edition; next year it will be even better. Speaking of next year, the conference language will be English, as we want linuxdev-br to become part of the international conference circuit! Stay tuned, and if you want to take part, give a talk, or sponsor, please reach us at

November 13, 2017 03:33 PM

November 07, 2017

Dave Airlie (blogspot): radv on Ubuntu broken in distro packages

It appears that the Ubuntu mesa 17.2.2 packages that ship radv carry patches to enable Mir support. These patches actually just break radv instead. I'd seen some people complain that simple apps don't work on radv, saying radv wasn't ready for use and asking how anyone could think of using it, and I just wondered what they had been smoking, since Fedora was working fine. Hopefully Canonical can sort that out ASAP.

November 07, 2017 07:35 PM