Kernel Planet

September 14, 2019

Matthew Garrett: It's time to talk about post-RMS Free Software

Richard Stallman has once again managed to demonstrate incredible insensitivity[1]. There's an argument that in a pure technical universe this is irrelevant and we should instead only consider what he does in free software[2], but free software isn't a purely technical topic - the GNU Manifesto is nakedly political, and while free software may result in better technical outcomes, it is fundamentally focused on individual freedom and will sacrifice technical excellence rather than compromise those freedoms. And in a political movement, there is no way that we can ignore the behaviour and beliefs of that movement's leader. Stallman is driving away our natural allies. It's inappropriate for him to continue as the figurehead for free software.

But I'm not calling for Stallman to be replaced. If the history of social movements has taught us anything, it's that tying a movement to a single individual is a recipe for disaster. The FSF needs a president, but there's no need for that person to be a leader - instead, we need to foster an environment where any member of the community can feel empowered to speak up about the importance of free software. A decentralised movement about returning freedoms to individuals can't also be about elevating a single individual to near-magical status. Heroes will always end up letting us down. We fix that by removing the need for heroes in the first place, not attempting to find increasingly perfect heroes.

Stallman was never going to save us. We need to take responsibility for saving ourselves. Let's talk about how we do that.

[1] There will doubtless be people who will leap to his defense with the assertion that he's neurodivergent and all of these cases are consequences of that.

(A) I am unaware of a formal diagnosis of that, and I am unqualified to make one myself. I suspect that basically everyone making that argument is similarly unqualified.
(B) I've spent a lot of time working with him to help him understand why various positions he holds are harmful. I've reached the conclusion that it's not that he's unable to understand; he's just unwilling to change his mind.

[2] This argument is, obviously, bullshit


September 14, 2019 11:57 AM

September 10, 2019

Davidlohr Bueso: Linux v5.2: Performance Goodies

locking/rwsem: optimize trylocking for the uncontended case

This applies the idea that in most cases a rwsem will be uncontended (single threaded); for example, experimentation showed that page fault paths really expect this. The change makes the trylock fast path avoid re-reading the lock's cacheline in a tight loop over and over. Note however that this can be a double-edged sword, as microbenchmarks have shown performance deterioration with high task counts, albeit mainly in pathological workloads.
[Commit ddb20d1d3aed a338ecb07a33]
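For illustration only, here is a minimal userspace sketch of that pattern using C11 atomics; the lock-word encoding and function name are invented for the example, and the kernel's actual rwsem code is considerably more involved.

#include <stdatomic.h>
#include <stdbool.h>

/* Sketch: do a plain load first and attempt a single cmpxchg only when the
 * lock word looks free, instead of spinning on cmpxchg and re-dirtying the
 * cacheline on every retry. */
static bool write_trylock_uncontended(atomic_long *count)
{
    long expected = atomic_load_explicit(count, memory_order_relaxed);

    if (expected != 0)      /* readers or a writer present: take the slow path */
        return false;

    /* one attempt; on failure the caller falls back to the contended path */
    return atomic_compare_exchange_strong_explicit(count, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}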

lib/lockref: limit number of cmpxchg loop retries

Unbounded loops are rather frowned upon, especially ones doing CAS operations. As such, Linus suggested adding an arbitrary upper bound to the loop to force the slowpath (spinlock fallback), which was seen to improve performance in an ad-hoc testcase on hardware that ends up playing the loop-retry game.
[Commit 893a7d32e8e0]
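The shape of such a bound looks roughly like the following userspace C11 sketch; the retry limit and names are made up for illustration and are not the values or helpers used by the lockref commit.

#include <stdatomic.h>
#include <stdbool.h>

#define CMPXCHG_MAX_RETRIES 16  /* illustrative bound only */

static bool lockless_increment(atomic_long *count)
{
    long old = atomic_load_explicit(count, memory_order_relaxed);

    for (int retry = 0; retry < CMPXCHG_MAX_RETRIES; retry++) {
        /* a failed CAS reloads the current value into 'old' for the next try */
        if (atomic_compare_exchange_weak_explicit(count, &old, old + 1,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed))
            return true;    /* lockless fast path succeeded */
    }
    return false;           /* give up: caller falls back to the spinlock */
}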
 

rcu: avoid unnecessary softirqs when system is idle

On an idle system with no pending callbacks, RCU softirqs to process callbacks were being triggered repeatedly. Specifically, a mismatch between cpu_no_qs and core_needs_qs was addressed.
[Commit 671a63517cf9]

rcu: fix potential cond_resched() slowdowns

When using the jiffies_till_sched_qs kernel boot parameter, a bug left jiffies_to_sched_qs initialized to zero, which negatively impacts cond_resched().
[Commit 6973032a602e]

mm: improve vmap allocation

Doing a vmalloc can be quite slow at times, and since it is done with preemption disabled, it can affect workloads that are sensitive to this. The problem lies in the fact that a new VA area is found by iterating over the busy list until a suitable hole between two busy areas is found. The changes introduce the always reliable red-black tree to keep blocks sorted by their offsets, along with a list keeping the free space in order of increasing addresses.
[Commit 68ad4a330433 68571be99f32]
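As a rough sketch of the bookkeeping being described (field names loosely mirror the kernel's vmap_area structure, but this is an illustration, not the actual implementation), each free area sits in both structures at once:

#include <linux/rbtree.h>
#include <linux/list.h>

/* Illustration: a free area lives in an rbtree ordered by start address
 * (fast lookup of a suitable hole) and in an address-ordered list (cheap
 * merging with neighbouring free areas). */
struct free_vmap_area_sketch {
	unsigned long		va_start;
	unsigned long		va_end;
	struct rb_node		rb_node;	/* in the free-area rbtree */
	struct list_head	list;		/* in the address-ordered list */
};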


mm/gup: safe usage of get_user_pages_fast() with DAX

Users of get_user_pages_fast() see potential performance benefits over the non-fast equivalent by avoiding mmap_sem. However, drivers such as rdma can pin these pages for a significant amount of time, which causes a number of issues for the filesystem, as long-lived references to pages block a number of critical operations and are known to mess up DAX. A new FOLL_LONGTERM flag is added and checked accordingly; this also means that other users such as xdp can now be converted to gup_fast.
[Commit 932f4a630a69 b798bec4741b 73b0140bf0fe 7af75561e171 9fdf4aa15673 664b21e717cf f3b4fdb18cb5 ]
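A hedged sketch of what such a caller might look like after the conversion; the function and variable names here are invented, and only get_user_pages_fast() and the FOLL_* flags come from the kernel API.

#include <linux/mm.h>

/* A driver that keeps pages pinned for a long time (e.g. rdma) now passes
 * FOLL_LONGTERM, letting gup handle or refuse DAX/filesystem pages that
 * must not remain pinned indefinitely. */
static int pin_user_buffer_long_term(unsigned long uaddr, int nr_pages,
				     struct page **pages)
{
	return get_user_pages_fast(uaddr, nr_pages,
				   FOLL_WRITE | FOLL_LONGTERM, pages);
}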

lib/sort: faster and smaller

Because CONFIG_RETPOLINE has made indirect calls much more expensive, these changes reduce the number of indirect calls made by the library sort functions, lib/sort and lib/list_sort. A number of optimizations and clever tricks are used, such as a more efficient bottom-up heapsort and playing nicer with store buffers.
[Commit 37d0ec34d111 22a241ccb2c1 8fb583c4258d 043b3f7b6388 b5c56e0cdd62]

ipc/mqueue: make msg priorities truly O(1)

By keeping the pointer to the tree's rightmost node, the process of consuming a message can be done in constant time, instead of logarithmic.
[Commit a5091fda4e3c]
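A minimal sketch of the caching idea (names are illustrative rather than the exact ipc/mqueue code): since the highest-priority message is the rightmost node of the rbtree, keeping a pointer to it turns the receive side into a pointer dereference instead of a walk down the tree.

#include <linux/rbtree.h>

struct msg_tree_sketch {
	struct rb_root	 msg_tree;		/* messages sorted by priority */
	struct rb_node	*msg_tree_rightmost;	/* cached on insert/erase */
};

static struct rb_node *highest_priority_msg(struct msg_tree_sketch *t)
{
	return t->msg_tree_rightmost;		/* O(1) instead of O(log n) */
}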

x86/fpu: load FPU registers on return to userland

This is a large, 27-patch, cleanup and optimization to only load fpu registers on return to userspace, instead of upon every context switch. This means that tasks that remain in kernel space do not load the registers. Accessing the fpu registers in the kernel now requires disabling preemption and bottom halves, for the scheduler and softirqs respectively.

x86/hyper-v: implement EOI optimization

Avoid a vmexit on EOI. This was seen to slightly improve IOPS when testing nvme disks with raid and ext4.
[Commit ba696429d290]
 

btrfs: improve performance on fsync of files with multiple hardlinks

A fix for a performance regression seen in pgbench: in order to avoid losing hard links and new ancestors of the fsynced inode, fsync could turn into a full transaction commit.
[Commit b8aa330d2acb]

fsnotify: fix unlink performance regression

This restores an unlink performance optimization that avoids take_dentry_name_snapshot().
[Commit 4d8e7055a405]

block/bfq: do not merge queues on flash storage with queuing

Disable queue merging on non-rotational devices with internal queueing, thus boosting throughput on interleaved IO.
[Commit 8cacc5ab3eac]

September 10, 2019 07:26 PM

September 08, 2019

James Bottomley: The Mythical Economic Model of Open Source

It has become fashionable today to study open source through the lens of economic benefits to developers and sometimes draw rather alarming conclusions. It has also become fashionable to assume a business model tie and then berate the open source community, or their licences, for lack of leadership when the business model fails. The purpose of this article is to explain, in the first part, the fallacy of assuming any economic tie in open source at all and, in the second part, go on to explain how economics in open source is situational and give an overview of some of the more successful models.

Open Source is a Creative Intellectual Endeavour

All the creative endeavours of humanity, like art, science or even writing code, are often viewed as activities that produce societal benefit. Logically, therefore, the people who engage in them are seen as benefactors of society, but assuming people engage in these endeavours purely to benefit society is mostly wrong. People engage in creative endeavours because it satisfies some deep need within themselves to exercise creativity and solve problems often with little regard to the societal benefit. The other problem is that the more directed and regimented a creative endeavour is, the less productive its output becomes. Essentially to be truly creative, the individual has to be free to pursue their own ideas. The conundrum for society therefore is how do you harness this creativity for societal good if you can’t direct it without stifling the very creativity you want to harness? Obviously society has evolved many models that answer this (universities, benefactors, art incubation programmes, museums, galleries and the like) with particular inducements like funding, collaboration, infrastructure and so on.

Why Open Source development is better than Proprietary

Simply put, the Open Source model, involving huge freedoms to developers to decide direction and great opportunities for collaboration stimulates the intellectual creativity of those developers to a far greater extent than when you have a regimented project plan and a specific task within it. The most creatively deadening job for any engineer is to find themselves strictly bound within the confines of a project plan for everything. This, by the way, is why simply allowing a percentage of paid time for participating in Open Source seems to enhance input to proprietary projects: the liberated creativity has a knock on effect even in regimented development. However, obviously, the goal for any Corporation dependent on code development should be to go beyond the knock on effect and actually employ open source methodologies everywhere high creativity is needed.

What is Open Source?

Open Source has its origin in code sharing models, permissive from BSD and reciprocal from GNU. However, one of its great strengths is that the reasons people do open source today aren’t the same reasons the framework was created in the first place. Today Open Source is a framework which stimulates creativity among developers and helps them create communities, provides economic benefits to corporations (provided they understand how to harness them) and produces a great societal good in general in terms of published reusable code.

Economics and Open Source

As I said earlier, the framework of Open Source has no tie to economics, in the same way things like artistic endeavour don’t. It is possible for a great artist to make money (as Picasso did), but it’s equally possible for a great artist to live their whole life in penury (as van Gogh did). The point of the analogy is that trying to measure the greatness of the art by the income of the artist is completely wrong and shortsighted. Developing the ability to exploit your art for commercial gain is an additional skill an artist can develop (or not, as they choose); it’s also an ability they could fail at, and in all cases it bears no relation to the societal good their art produces. In precisely the same way, finding an economic model that allows you to exploit open source (either individually or commercially) is firstly a matter of choice (if you have other reasons for doing Open Source, there’s no need to bother) and secondly not a guarantee of success, because not all models succeed. Perhaps the easiest way to appreciate this is through the lens of personal history.

Why I got into Open Source

As a physics PhD student, I’d always been interested in how operating systems functioned, but thanks to the BSD lawsuit and being in the UK I had no access to the actual source code. When Linux came along as a distribution in 1992, it was a revelation: not only could I read the source code, but I could have a fully functional UNIX-like system at home instead of having to queue for time to write up my thesis in TeX on the limited number of department terminals.

After completing my PhD I was offered a job looking after computer systems in the department and my first success was shaving a factor of ten off the computing budget by buying cheap pentium systems running Linux instead of proprietary UNIX workstations. This success was nearly derailed by an NFS bug in Linux but finding and fixing the bug (and getting it upstream into the 1.0.2 kernel) cemented the budget savings and proved to the department that we could handle this new technology for a fraction of the cost of the old. It also confirmed my desire to poke around in the Operating System which I continued to do, even as I moved to America to work on Proprietary software.

In 2000 I got my first Open Source break when the product I’d been working on got sold to a silicon valley startup, SteelEye, whose business plan was to bring High Availability to Linux. As the only person on the team with an Open Source track record, I became first the Architect and later CTO of the company, with my first job being to make the somewhat eccentric Linux SCSI subsystem work for the shared SCSI clusters LifeKeeper then used. Getting SCSI working led to fun interactions with the Linux community, an invitation to present on fixing SCSI at the 2002 Kernel Summit, and the maintainership of SCSI in 2003. From that point, working on upstream open source became a fixture of my job requirements, and as I progressed through Novell, Parallels and now IBM it also became a quality sought by employers.

I have definitely made some money consulting on Open Source, but it’s been dwarfed by my salary which does get a boost from my being an Open Source developer with an external track record.

The Primary Contributor Economic Models

Looking at the active contributors to Open Source, the primary model is that either your job description includes working on designated open source projects, so you’re paid to contribute as your day job, or you were hired because of what you’ve already done in open source and contributing more is a tolerated use of your employer’s time. A third, and by far smaller, group is people who work full-time on Open Source but fund themselves either by shared contributions like Patreon or Tidelift or by actively consulting on their projects. However, these models cover existing contributors and they’re not really a route to becoming a contributor, because employers like certainty: they’re unlikely to hire someone with no track record to work on open source, and are probably not going to tolerate use of their time for developing random open source projects. This means that the route to becoming a contributor, like the route to becoming an artist, is to begin in your own time.

Users versus Developers

Open Source, by its nature, is built by developers for developers. This means that although the primary consumers of open source are end users, they get pretty much no say in how the project evolves. This lack of user involvement has been lamented over the years, especially in projects like the Linux Desktop, but no real community solution has ever been found. The bottom line is that users often don’t know what they want and even if they do they can’t put it in technical terms, meaning that all user driven product development involves extensive and expensive product research which is far beyond any open source project. However, this type of product research is well within the ability of most corporations, who can also afford to hire developers to provide input and influence into Open Source projects.

Business Model One: Reflecting the Needs of Users

In many ways, this has become the primary business model of open source. The theory is simple: develop a traditional customer-focussed business strategy and execute it by connecting the gathered opinions of customers to the open source project, in exchange for revenue from subscriptions, support or even early shipped product. The business value to the end user is simple: it’s the business value of the product tuned to their needs, given that they wouldn’t be prepared to develop the skills to interact with the open source developer community themselves. This business model starts to break down if the end users acquire developer sophistication, as happens with Red Hat and Enterprise users. However, this can still be combatted by making sure it’s economically unfeasible for a single end user to match the breadth of the offering (the entire distribution). In this case, the ability of the end user to become involved in individual open source projects which matter to them is actually a better and cheaper way of doing product research and feeds back into the synergy of this business model.

This business model entirely breaks down when, as in the case of the cloud service provider, the end user becomes big enough and technically sophisticated enough to run their own distributions and sees doing this as a necessary adjunct to their service business. This means that you can no longer escape the technical sophistication of the end user by pursuing a breadth-of-offerings strategy.

Business Model Two: Drive Innovation and Standardization

Although venture capitalists (VCs) pay lip service to the idea of constant innovation, this isn’t actually what they do as a business model: they tend to take an innovation and then monetize it. The problem is this model doesn’t work for open source: retaining control of an open source project requires a constant stream of innovation within the source tree itself. Single innovations get attention but unless they’re followed up with another innovation, they tend to give the impression your source tree is stagnating, encouraging forks. However, the most useful property of open source is that by sharing a project and encouraging contributions, you can obtain a constant stream of innovation from a well managed community. Once you have a constant stream of innovation to show, forking the project becomes much harder, even for a cloud service provider with hundreds of developers, because they must show they can match the innovation stream in the public tree. Add to that standardization, which in open source simply means getting your project adopted for use by multiple consumers (say two different clouds, or a range of industry). Further, if the project is largely run by a single entity and properly managed, seeing the incoming innovations allows you to recruit the best innovators, thus giving you direct ownership of most of the innovation stream. In the early days, you make money simply by offering user connection services as in Business Model One, but the ultimate goal is likely acquisition for the talent possessed, which is a standard VC exit strategy.

All of this points to the hypothesis that the current VC model is wrong. Instead of investing in people with the ideas, you should be investing in people who can attract and lead others with ideas.

Other Business Models

Although the models listed above have proven successful over time, they’re by no means the only possible ones. As the space of potential business models gets explored, it could turn out they’re not even the best ones, meaning the potential innovation a savvy business executive might bring to open source is newer and better business models.

Conclusions

Business models are optional extras with open source, and just because you have a successful open source project does not mean you’ll have an equally successful business model unless you put sufficient thought into constructing and maintaining it. Thus a successful open source start-up requires three elements: a sound business model (or someone who can evolve one), a solid community leader and manager, and someone with technical ability in the problem space.

If you like working in Open Source as a contributor, you don’t necessarily have to have a business model at all and you can often simply rely on recognition leading to opportunities that provide sufficient remuneration.

Although there are several well known business models for exploiting open source, there’s no reason you can’t create your own different one but remember: a successful open source project in no way guarantees a successful business model.

September 08, 2019 09:35 AM

September 04, 2019

Linux Plumbers Conference: LPC waiting list closed; just a few days until the conference

The waiting list for this year’s Linux Plumbers Conference is now closed. All of the spots available have been allocated, so anyone who is not registered at this point will have to wait for next year. There will be no on-site registration. We regret that we could not accommodate everyone. The good news is that all of the microconferences, refereed talks, Kernel summit track, and Networking track will be recorded on video and made available as soon as possible after the conference. Anyone who could not make it to Lisbon this year will at least be able to catch up with what went on. Hopefully those who wanted to come will make it to a future LPC.

For those who are attending, we are just a few days away; you should have received an email with more details. Beyond that, the detailed schedule is available. There are also some tips on using the metro to get to the venue. As always, please send any questions or comments to “contact@linuxplumbersconf.org”.

September 04, 2019 09:31 PM

August 30, 2019

Pete Zaitcev: Docker Block Storage... say what again?

Found an odd job posting at the website of Rancher:

What you will be doing

Okay. Since they talk about consistency and replication together, this thing probably provides an actual service, in addition to the necessary orchestration. Kind of like the ill-fated Sheepdog. They may underestimate the amount of work necessary, sure. Look no further than Ceph RBD. Remember how much work it took for a genius like Sage? But a certain arrogance is essential in a start-up, and Rancher only employs 150 people.

Also, nobody is dumb enough to write orchestration in Go, right? So this probably is not just a layer on top of Ceph or whatever.

Well, it's still possible that it's merely an in-house equivalent of OpenStack Cinder, and they want it in Go because they are a Go house and if you have a hammer everything looks like a nail.

Either way, here's the main question: what does block storage have to do with Docker?

Docker, as far as I know, is a container runtime. And containers do not consume block storage. They plug into a Linux kernel that presents POSIX to them, only namespaced. Granted, certain applications consume block storage through Linux, that is why we have O_DIRECT. But to roll out a whole service like this just for those rare applications... I don't think so.

Why would anyone want block storage for (Docker) containers? Isn't it absurd? What am I missing and what is Rancher up to?

UPDATE:

The key thing to remember here is that while running containers aren't using block storage, Docker containers are distributed as disk images, and each gets its own root filesystem by default. Therefore, any time anyone adds a Docker container, they have to allocate a block device and dump the application image into it. So, yes, it is some kind of Docker Cinder they are trying to build.

See Red Hat docs about managing Docker block storage in Atomic Host (h/t penguin42).

August 30, 2019 05:52 PM

August 15, 2019

Pete Zaitcev: POST, PUT, and CRUD

Anyone who ever worked with object storage knows that PUT creates, GET reads, POST updates, and DELETE deletes. Naturally, right? POST is such a strange verb with oddball encodings that it's perfect to update, while GET and PUT are matching twins like read(2) and write(2). Imagine my surprise, then, when I found that the official definition of RESTful makes POST create objects and PUT update them. There is even a FAQ, which uses sophistry and appeals to the authority of RFCs in order to justify this.

So, in the world of RESTful solipsism, you would upload an object foo into a bucket buk by issuing "POST /buk?obj=foo" [1], while "PUT /buk/foo" applies to pre-existing resources. Although, they had to admit that RFC-2616 assumes that PUT creates.

All this goes to show, too much dogma is not good for you.

[1] It's worse, actually. They want you to do "POST /buk", and receive a resource ID, generated by the server, and use that ID to refer to the resource.

August 15, 2019 07:10 PM

August 14, 2019

Greg Kroah-Hartman: Patch workflow with mutt - 2019

Given that the main development workflow for most kernel maintainers is with email, I spend a lot of time in my email client. For the past few decades I have used mutt, but every once in a while I look around to see if there is anything else out there that might work better.

One project that looks promising is aerc, which was started by Drew DeVault. It is a terminal-based email client written in Go, and relies on a lot of other Go libraries to handle a lot of the “grungy” work in dealing with imap clients, email parsing, and other fun things when it comes to the free-flow text parsing that emails require.

aerc isn’t in a usable state for me just yet, but Drew asked if I could document exactly how I use an email client for my day-to-day workflow to see what needs to be done to aerc to have me consider switching.

Note, this isn’t a criticism of mutt at all. I love the tool, and spend more time using that userspace program than any other. But as anyone who knows email clients will tell you, they all suck, it’s just that mutt sucks less than everything else (that’s literally their motto).

I did a basic overview of how I apply patches to the stable kernel trees quite a few years ago, but my workflow has evolved over time, so instead of just writing a private email to Drew, I figured it was time to post something showing others just how the sausage really is made.

Anyway, my email workflow can be divided up into 3 different primary things that I do: basic email reading, development patch review and apply, and stable patch review and apply.

Given that all stable kernel patches need to already be in Linus’s kernel tree first, the workflow for the stable tree is much different from the workflow for new patches.

Basic email reading

All of my email ends up in one of two “inboxes” on my local machine. One is for everything that is sent directly to me (either with To: or Cc:), as well as a number of mailing lists whose messages I read in full because I am a maintainer of those subsystems (like USB or stable). The second inbox consists of other mailing lists that I do not read every message of, but review as needed and can reference when I need to look something up. Those mailing lists are the “big” linux-kernel mailing list, to ensure I have a local copy to search from when I am offline (due to traveling), as well as other “minor” development mailing lists that I like to keep a local copy of, like linux-pci, linux-fsdevel, and a few other smaller vger lists.

I get these maildir folders synced with the mail server using mbsync, which works really well and is much faster than offlineimap, which I used for many, many years and which ends up being really slow when you do not live on the same continent as the mail server. Luis’s recent post about switching to mbsync finally pushed me to take the time to configure it all properly and I am glad that I did.

Let’s ignore my “lists” inbox, as that should be able to be read by any email client by just pointing it at it. I do this with a simple alias:

alias muttl='mutt -f ~/mail_linux/'

which allows me to type muttl at any command line to instantly bring it up:

What I spend most of the time in is my “main” mailbox, and that is in a local maildir that gets synced when needed in ~/mail/INBOX/. A simple mutt on the command line brings this up:

Yes, everything just ends up in one place; in handling my mail, I prune relentlessly. Everything ends up in one of 3 states for what I need to do next:

Everything that does not require a response, or I’ve already responded to it, gets deleted from the main INBOX at that point in time, or saved into an archive in case I need to refer back to it again (like mailing list messages).

That last state makes me save the message into one of two local maildirs, todo and stable. Everything in todo is a new patch that I need to review, comment on, or apply to a development tree. Everything in stable is something that has to do with patches that need to get applied to the stable kernel tree.

Side note, I have scripts that run frequently that email me any patches that need to be applied to the stable kernel trees, when they hit Linus’s tree. That way I can just live in my email client and have everything that needs to be applied to a stable release in one place.

I sweep my main INBOX every few hours, and sort things out by either quickly responding, deleting, archiving, or saving into the todo or stable directory. I don’t achieve a constant “inbox zero”, but if I only have 40 or so emails in there, I am doing well.

So, for this main workflow, I need an easy way to:

These are all tasks that I bet almost everyone needs to do all the time, so a tool like aerc should be able to do that easily.

A note about filtering. As everything comes into one inbox, it is easier to filter that mbox based on things so I can process everything at once.

As an example, I want to read all of the messages sent to the linux-usb mailing list right now, and not see anything else. To do that, in mutt, I press l (limit) which brings up a prompt for a filter to apply to the mbox. This ability to limit messages to one type of thing is really powerful and I use it in many different ways within mutt.

Here’s an example of me just viewing all of the messages that are sent to the linux-usb mailing list, and saving them off after I have read them:

This isn’t that complex, but it has to work quickly and well on mailboxes that are really really big. As an example, here’s me opening my “all lists” mbox and filtering on the linux-api mailing list messages that I have not read yet. It’s really fast as mutt caches lots of information about the mailbox and does not require reading all of the messages each time it starts up to generate its internal structures.

Saving any message to the todo directory is a two-keystroke sequence, .t, which saves the message there automatically.

Again, those are bindings I set up years ago: , jumps to the specific mbox, and . copies the message to that location.
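The bindings would look something like the following hypothetical ~/.muttrc lines (illustrative only; the folder paths and exact commands will differ in any real setup):

macro index ,t '<change-folder>~/mail/todo<enter>'
macro index .t '<copy-message>~/mail/todo<enter>'

Here <change-folder> switches to the named mailbox and <copy-message> copies the current message into it.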

Now you see why using mutt is not exactly obvious, those bindings are not part of the default configuration and everyone ends up creating their own custom key bindings for whatever they want to do. It takes a good amount of time to figure this out and set things up how you want, but once you are over that learning curve, you can do very complex things easily. Much like an editor (emacs, vim), you can configure them to do complex things easily, but getting to that level can take a lot of time and knowledge. It’s a tool, and if you are going to rely on it, you should spend the time to learn how to use your tools really well.

Hopefully aerc can get to this level of functionality soon. Odds are everyone else does something much like this, as my use-case is not unusual.

Now let’s get to the unusual use cases, the fun things:

Development Patch review and apply

When I decide it’s time to review and apply patches, I do so by subsystem (as I maintain a number of different ones). As all pending patches are in one big maildir, I filter the messages by the subsystem I care about at the moment, and save all of the messages out to a local mbox file that I call s (hey, naming is hard, it gets worse, just wait…)

So, in my linux/work/ local directory, I keep the development trees for different subsystems like usb, char-misc, driver-core, tty, and staging.

Let’s look at how I handle some staging patches.

First, I go into my ~/linux/work/staging/ directory, which I will stay in while doing all of this work. I open the todo mbox with a quick ,t pressed within mutt (a macro I picked from somewhere long ago, I don’t remember where…), and then filter all staging messages, and save them to a local mbox with the following keystrokes:

mutt
,t
l staging
T
s ../s

Yes, I could skip the l staging step, and just do T staging instead of T, but it’s nice to see what I’m going to save off first before doing so:

Now all of those messages are in a local mbox file that I can open with a single keystroke, ’s’ on the command line. That is an alias:

alias s='mutt -f ../s'

I then dig around in that mbox, sort patches by driver type to see everything for that driver at once by filtering on the name and then save those messages to another mbox called ‘s1’ (see, I told you the names got worse.)

s
l erofs
T
s ../s1

I have lots of local mbox files all “intuitively” named ‘s1’, ‘s2’, and ‘s3’. Of course I have aliases to open those files quickly:

alias s1='mutt -f ../s1'
alias s2='mutt -f ../s2'
alias s3='mutt -f ../s3'

I have a number of these mbox files as sometimes I need to filter even further by patch set, or other things, and saving them all to different mboxes makes things go faster.

So, all the erofs patches are in one mbox, let’s open it up and review them, and save the patches that look good enough to apply to another mbox:

Turns out that not all patches need to be dealt with right now (moving erofs out of the staging directory requires other people to review it), so I just save those messages back to the todo mbox:

Now I have a single patch that I want to apply, but I need to add some acks that the maintainers of erofs provided. I do this by editing the “raw” message directly from within mutt. I open the individual messages from the maintainers, cut their reviewed-by lines, and then edit the original patch and add those lines to the patch:

Some kernel maintainers right now are screaming something like “Automate this!”, “Patchwork does this for you!”, “Are you crazy?” Yeah, this is one place that I need to work on, but the time involved to do this is not that much and it’s not common that others actually review patches for subsystems I maintain, unfortunately.

The ability to edit a single message directly within my email client is essential. I end up having to fix up changelog text, editing the subject line to be correct, fixing the mail headers to not do foolish things with text formats, and in some cases, editing the patch itself for when it is corrupted or needs to be fixed (I want a Linkedin skill badge for “can edit diff files by hand and have them still work”)

So one hard requirement I have is “editing a raw message from within the email client.” If an email client can not do this, it’s a non-starter for me, sorry.

So we now have a single patch that needs to be applied to the tree. I am already in the ~/linux/work/staging/ directory, and on the correct git branch for where this patch needs to go (how I handle branches and how patches move between them deserve a totally different blog post…)

I can apply this patch in one of two different ways, using git am -s ../s1 on the command line, piping the whole mbox into git and applying the patches directly, or I can apply them within mutt individually by using a macro.

When I have a lot of patches to apply, I just pipe the mbox file to git am -s as I’m comfortable with that, and it goes quick for multiple patches. It also works well as I have lots of different terminal windows open in the same directory when doing this and I can quickly toggle between them.

But we are talking about email clients at the moment, so here’s me applying a single patch to the local git tree:

All it took was hitting the L key. That key is set up as a macro in my mutt configuration file with a single line:

macro index L '| git am -s'\n

This macro pipes the output of the current message to git am -s.

The ability of mutt to pipe the current message (or messages) to external scripts is essential for my workflow in a number of different places. Not having to leave the email client but being able to run something else with that message, is a very powerful functionality, and again, a hard requirement for me.

So that’s it for applying development patches. It’s a bunch of the same tasks over and over:

Doing that all within the email program and being able to quickly get in, and out of the program, as well as do work directly from the email program, is key.

Of course I do a “test build and sometimes test boot and then push git trees and notify author that the patch is applied” set of steps when applying patches too, but those are outside of my email client workflow and happen in a separate terminal window.

Stable patch review and apply

The process of reviewing patches for the stable tree is much like the development patch process, but it differs in that I never use ‘git am’ for applying anything.

The stable kernel tree, while under development, is kept as a series of patches that need to be applied to the previous release. This series of patches is maintained by using a tool called quilt. Quilt is very powerful and handles sets of patches that need to be applied on top of a moving base very easily. The tool was based on a crazy set of shell scripts written by Andrew Morton a long time ago, is currently maintained by Jean Delvare, and has been rewritten in perl to make it more maintainable. It handles thousands of patches easily and quickly and is used by many developers to handle kernel patches for distributions as well as other projects.

I highly recommend it as it allows you to reorder, drop, add in the middle of the series, and manipulate patches in all sorts of ways, as well as create new patches directly. I do this for the stable tree as lots of times we end up dropping patches from the middle of the series when reviewers say they should not be applied, adding new patches where needed as prerequisites of existing patches, and other changes that with git, would require lots of rebasing.

Rebasing a git tree does not work when you have developers working “down” from your tree. We usually have the rule with kernel development that if you have a public tree, it never gets rebased, otherwise no one can use it for development.

Anyway, the stable patches are kept in a quilt series in a repository that is kept under version control in git (complex, yeah, sorry.) That queue can always be found here.

I do create a linux-stable-rc git tree that is constantly rebased based on the stable queue for those who run test systems that can not handle quilt patches. That tree is found here and should not ever be used by anyone for anything other than automated testing. See this email for a bit more explanation of how these git trees should, and should not, be used.

With all that background information behind us, let’s look at how I take patches that are in Linus’s tree, and apply them to the current stable kernel queues:

First I open the stable mbox. Then I filter by everything that has upstream in the subject line. Then I filter again by alsa to only look at the alsa patches. I look at the individual patches, looking at the patch to verify that it really is something that should be applied to the stable tree and determine what order to apply the patches in based on the date of the original commit.

I then hit F to pipe the message to a script that looks up the Fixes: tag in the message to determine what stable tree, if any, the commit that this fix was contained in.

In this example, the patch only should go back to the 4.19 kernel tree, so when I apply it, I know to stop at that place and not go further.

To apply the patch, I hit A, which is another macro that I define in my mutt configuration:

macro index A |'~/linux/stable/apply_it_from_email'\n
macro pager A |'~/linux/stable/apply_it_from_email'\n

It is defined “twice” because you can have different key bindings when you are looking at the mailbox’s index of all messages versus when you are looking at the contents of a single message.

In both cases, I pipe the whole email message to my apply_it_from_email script.

That script digs through the message, finds the git commit id of the patch in Linus’s tree, then runs a different script that takes the commit id, exports the patch associated with that id, edits the message to add my signed-off-by to the patch, and drops me into my editor to make any tweaks that might be needed (sometimes files get renamed so I have to do that by hand). It also gives me one final chance to review the patch in my editor, which is usually easier than in the email client directly as I have better syntax highlighting and can search and review the text better.

If all goes well, I save the file and the script continues and applies the patch to a bunch of stable kernel trees, one after another, adding the patch to the quilt series for that specific kernel version. To do all of this I had to spawn a separate terminal window as mutt does fun things to standard input/output when piping messages to a script, and I couldn’t ever figure out how to do this all without doing the extra spawn process.

Here it is in action, as a video, since asciinema can’t show multiple windows at the same time.

Once I have applied the patch, I save it away as I might need to refer to it again, and I move on to the next one.

This sounds like a lot of different steps, but I can process a lot of these relatively quickly. The patch review step is the slowest one here, as that of course can not be automated.

I later take those new patches that have been applied and run kernel build tests and other things before sending out emails saying they have been applied to the tree. But like with development patches, that happens outside of my email client workflow.

Bonus, sending email from the command line

In writing this up, I remembered that I do have some scripts that use mutt to send email out. I don’t normally use mutt for this for patch reviews, as I use other scripts for that (ones that eventually got turned into git send-email), so it’s not a hard requirement, but it is nice to be able to do a simple:

mutt -s "${subject}" "${address}" <  ${msg} >> error.log 2>&1

from within a script when needed.

Thunderbird also can do this, I have used:

thunderbird --compose "to='${address}',subject='${subject}',message=${msg}"

at times in the past when dealing with email servers that mutt can not connect to easily (i.e. gmail when using oauth tokens).

Summary of what I need from an email client

So, to summarize it all for Drew, here’s my list of requirements for me to be able to use an email client for kernel maintainership roles:

That’s what I use for kernel development.

Oh, I forgot:

Bonus things that I have grown to rely on when using mutt are:

If you have made it this far, and you aren’t writing an email client, that’s amazing, it must be a slow news day, which is a good thing. I hope this writeup helps others to show them how mutt can be used to handle developer workflows easily, and what is required of a good email client in order to be able to do all of this.

Hopefully other email clients can get to state where they too can do all of this. Competition is good and maybe aerc can get there someday.

August 14, 2019 12:37 PM

August 10, 2019

Pete Zaitcev: Comment to 'О цифровой экономике и глобальных проблемах человечества' ('On the digital economy and the global problems of humanity') by omega_hyperon

I hear all of this every day. These people take the well-established trend of slowing scientific and technological progress and say: we know better what humanity needs. Take the money away from the undeserving, give it to smart people like me, and progress will pick up again. And we'll protect nature while we're at it! And capitalism is always to blame.

That civilization is treading water is not in dispute. But here are a couple of things this scumbag keeps quiet about.

First, if you take the money away from Disney and give it to a research institute, there won't be any money. He seems to speak Russian, yet failed to take away even that simple lesson from the collapse of the USSR. American science routed Soviet science in the era of the scientific-technological revolution above all because the capitalist economy provided an economic base for that science, while the socialist economy was a failure.

In general, if you compare the budgets of Apple and Disney with the budget of Housing and Urban Development and similar agencies, the difference is two orders of magnitude. If someone wants to give science more money, the answer is not to rob Disney but to stop handing freeloaders free housing. The slowdown of science and of capitalism go hand in hand and are caused by government policy, not by some bitcoin.

Second, who even believes these charlatans? They fed us stories about the children in Africa for decades, yet during the terrible famine in Ethiopia its population grew from 38 million to 75 million. The same thing happened with the polar bears. The planet's forested area is growing. Suppose some forests in Brazil were cut down for pasture... but who is going to believe it?

This crisis of expertise is no joke. Fear of vaccines was created not by capitalism and bitcoin, but by the rot and decay of the system of scientific research as a whole. He didn't name the institute whose budget he compared with Disney's, but it would be interesting to know how many idlers are on its staff.

The collapse of science shows not only in how the public has lost faith in scientists. The objective indicators have sagged too. The reproducibility of publications is very poor, and getting worse. Is bitcoin to blame for that too?

On the whole, most of this whining strikes me as extremely harmful. If he cannot diagnose the causes of the crisis, the solutions he proposes will give us nothing, and biodiversity will not increase either.


August 10, 2019 12:46 PM

August 02, 2019

Michael Kerrisk (manpages): man-pages-5.02 is released

I've released man-pages-5.02. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from 28 contributors. The release includes around 120 commits that change more than 50 pages.

The most notable of the changes in man-pages-5.02 is the following:


August 02, 2019 09:04 AM

July 25, 2019

Pete Zaitcev: Swift is 2 to 4 times faster than any competitor

Or so they say, at least for certain workloads.

In January of 2015 I led a project to evaluate and select a next-generation storage platform that would serve as the central storage (sometimes referred to as an active archive or tier 2) for all workflows. We identified the following features as being key to the success of the platform:

In hindsight, I wish we would have included two additional requirements:

We spent the next ~1.5 years evaluating the following systems:

SwiftStack was the only solution that literally checked every box on our list of desired features, but that’s not the only reason we selected it over the competition.

The top three reasons behind our selection of SwiftStack were as follows:

Note: While SwiftStack 1space was not a part of the SwiftStack platform at the time of our evaluation and purchase, it would have been an additional deciding factor in favor of SwiftStack if it had been.

Interesting. It should be noted that the performance of Swift is a great match for some workloads, but not for others. In particular, Swift is weak on small-file workloads, such as Gnocchi, which writes a ton of 16-byte objects again and again. The overhead is a killer there, and not just on the wire: Swift has to update its accounting databases each and every time a write is done, so that "swift stat" shows things like quotas. Swift is also not particularly good at HPC-style workloads, which benefit from great bisection bandwidth, because we transfer all user data through so-called "proxy" servers. Unlike e.g. Ceph, Swift keeps the cluster topology hidden from the client, while a Ceph client actually tracks the ring changes, placement groups and their leaders, etc. But as we can see, once the object sizes start climbing and the number of clients increases, Swift rapidly approaches the wire speed.

I cannot help noticing that the architecture in question has a front-facing cache or pool (tier 1), which is what the ultimate clients see instead of Swift. Most of the time, Swift is selected for its ability to serve tens of thousands of clients simultaneously, but not in this case. Apparently, the end user invented ProxyFS independently.

There's no mention of Red Hat selling Swift in the post. Either it was not part of the evaluation at all, or the author forgot about it with the passing of time. He did list a bunch of rather weird and obscure storage solutions, though.

July 25, 2019 02:38 AM

July 18, 2019

Kees Cook: security things in Linux v5.2

Previously: v5.1.

Linux kernel v5.2 was released last week! Here are some security-related things I found interesting:

page allocator freelist randomization
While the SLUB and SLAB allocator freelists have been randomized for a while now, the overarching page allocator itself wasn’t. This meant that anything doing allocation outside of the kmem_cache/kmalloc() would have deterministic placement in memory. This is bad both for security and for some cache management cases. Dan Williams implemented this randomization under CONFIG_SHUFFLE_PAGE_ALLOCATOR now, which provides additional uncertainty to memory layouts, though at a rather low granularity of 4MB (see SHUFFLE_ORDER). Also note that this feature needs to be enabled at boot time with page_alloc.shuffle=1 unless you have direct-mapped memory-side-cache (you can check the state at /sys/module/page_alloc/parameters/shuffle).

stack variable initialization with Clang
Alexander Potapenko added support via CONFIG_INIT_STACK_ALL for Clang’s -ftrivial-auto-var-init=pattern option that enables automatic initialization of stack variables. This provides even greater coverage than the prior GCC plugin for stack variable initialization, as Clang’s implementation also covers variables not passed by reference. (In theory, the kernel build should still warn about these instances, but even if they exist, Clang will initialize them.) Another notable difference between the GCC plugins and Clang’s implementation is that Clang initializes with a repeating 0xAA byte pattern, rather than zero. (Though this changes under certain situations, like for 32-bit pointers which are initialized with 0x000000AA.) As with the GCC plugin, the benefit is that the entire class of uninitialized stack variable flaws goes away.
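As a contrived illustration of the bug class this addresses (the example below is illustrative, not taken from the kernel): built with -ftrivial-auto-var-init=pattern, the never-written field starts life as a 0xAA pattern instead of leaking whatever happened to be on the stack.

struct report {
	long nbytes;
	long flags;
};

long fill_report(struct report *out, int fail)
{
	struct report r;		/* deliberately has no initializer */

	if (!fail)
		r.nbytes = 42;
	r.flags = 0;

	*out = r;			/* copies r.nbytes even when it was never set */
	return out->nbytes;
}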

Kernel Userspace Access Prevention on powerpc
Like SMAP on x86 and PAN on ARM, Michael Ellerman and Russell Currey have landed support for disallowing access to userspace without explicit markings in the kernel (KUAP) on Power9 and later PPC CPUs under CONFIG_PPC_RADIX_MMU=y (which is the default). This is the continuation of the execute protection (KUEP) in v4.10. Now if an attacker tries to trick the kernel into any kind of unexpected access from userspace (not just executing code), the kernel will fault.

Microarchitectural Data Sampling mitigations on x86
Another set of cache memory side-channel attacks came to light, and were consolidated together under the name Microarchitectural Data Sampling (MDS). MDS is weaker than other cache side-channels (less control over target address), but memory contents can still be exposed. Much like L1TF, when one’s threat model includes untrusted code running under Symmetric Multi Threading (SMT: more logical cores than physical cores), the only full mitigation is to disable hyperthreading (boot with “nosmt“). For all the other variations of the MDS family, Andi Kleen (and others) implemented various flushing mechanisms to avoid cache leakage.

unprivileged userfaultfd sysctl knob
Both FUSE and userfaultfd provide attackers with a way to stall a kernel thread in the middle of memory accesses from userspace by initiating an access on an unmapped page. While FUSE is usually behind some kind of access controls, userfaultfd hadn’t been. To avoid various heap grooming and heap spraying techniques for exploiting Use-after-Free flaws, Peter Xu added the new “vm.unprivileged_userfaultfd” sysctl knob to disallow unprivileged access to the userfaultfd syscall.

temporary mm for text poking on x86
The kernel regularly performs self-modification with things like text_poke() (during stuff like alternatives, ftrace, etc). Before, this was done with fixed mappings (“fixmap”) where a specific fixed address at the high end of memory was used to map physical pages as needed. However, this resulted in some temporal risks: other CPUs could write to the fixmap, or there might be stale TLB entries on removal that other CPUs might still be able to write through to change the target contents. Instead, Nadav Amit has created a separate memory map for kernel text writes, as if the kernel is trying to make writes to userspace. This mapping ends up staying local to the current CPU, and the poking address is randomized, unlike the old fixmap.

ongoing: implicit fall-through removal
Gustavo A. R. Silva is nearly done with marking (and fixing) all the implicit fall-through cases in the kernel. Based on the pull request from Gustavo, it looks very much like v5.3 will see -Wimplicit-fallthrough added to the global build flags and then this class of bug should stay extinct in the kernel.

CLONE_PIDFD added
Christian Brauner added the new CLONE_PIDFD flag to the clone() system call, which complements the pidfd work in v5.1 so that programs can now gain a handle for a process ID right at fork() (actually clone()) time, instead of needing to get the handle from /proc after process creation. With signals and forking now enabled, the next major piece (already in linux-next) will be adding P_PIDFD to the waitid() system call, and common process management can be done entirely with pidfd.
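A minimal userspace sketch of what the new flag enables, assuming a libc that provides the clone() wrapper; CLONE_PIDFD is defined as a fallback in case the installed headers predate v5.2.

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000
#endif

static char child_stack[1024 * 1024];

static int child_fn(void *arg)
{
	return 0;			/* child does nothing and exits */
}

int main(void)
{
	int pidfd = -1;

	/* With CLONE_PIDFD, the kernel stores the new pidfd through the
	 * parent_tid argument (here &pidfd) at clone() time. */
	int pid = clone(child_fn, child_stack + sizeof(child_stack),
			CLONE_PIDFD | SIGCHLD, NULL, &pidfd);
	if (pid < 0) {
		perror("clone");
		exit(EXIT_FAILURE);
	}

	printf("child pid %d, pidfd %d\n", pid, pidfd);
	/* The pidfd can be poll()ed or passed to pidfd_send_signal();
	 * reaping here still uses plain waitpid(). */
	waitpid(pid, NULL, 0);
	return 0;
}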

Edit: added CLONE_PIDFD notes, as reminded by Christian Brauner. :)

That’s it for now; let me know if you think I should add anything here. We’re almost to -rc1 for v5.3!

© 2019, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

July 18, 2019 12:07 AM

July 17, 2019

Linux Plumbers Conference: System Boot and Security Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the System Boot and Security Microconference has been accepted into the 2019 Linux Plumbers Conference! Computer-system security is a topic that has gotten a lot of serious attention over the years, but nowhere near as much attention has been paid to the system firmware. Yet firmware is also a target for those looking to wreak havoc on our systems, and while firmware is now being developed with security in mind, it provides incomplete solutions. This microconference will focus on the security of the system, especially from the time the system is powered on.

Expected topics for this year include:

Come and join us in the discussion of keeping your system secure even at boot up.

We hope to see you there!

July 17, 2019 04:57 PM

July 10, 2019

Linux Plumbers Conference: Power Management and Thermal Control Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Power Management and Thermal Control Microconference has been accepted into the 2019 Linux Plumbers Conference! Power management and thermal control are important areas in the Linux ecosystem that help reduce computing's impact on the environment. In recent years, computer systems have been becoming more and more complex and thermally challenged at the same time, and the energy-efficiency expectations placed on them have been growing. This trend is likely to continue in the foreseeable future, despite the progress made in the power-management and thermal-control problem space since last year's Linux Plumbers Conference. That progress includes, but is not limited to, the merging of the energy-aware scheduling patch series and CPU idle-time management improvements; there will be more work to do in those areas. This gathering will focus on continuing to have Linux meet the power-management and thermal-control challenge.

Topics for this year include:

Come and join us in the discussion of how to extend the battery life of your laptop while keeping it cool.

We hope to see you there!

July 10, 2019 10:29 PM

July 09, 2019

Linux Plumbers Conference: Android Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Android Microconference has been accepted into the 2019 Linux Plumbers Conference! Android has a long history at Linux Plumbers and has continually made progress as a direct result of these meetings. This year’s focus will be a fairly ambitious goal to create a Generic Kernel Image (GKI) (or one kernel to rule them all!). Having a GKI will allow silicon vendors to be independent of the Linux kernel running on the device. As such, kernels could be easily upgraded without requiring any rework of the initial hardware porting efforts. This microconference will also address areas that have been discussed in the past.

The proposed topics include:

Come and join us in the discussion of improving what is arguably the most popular operating system in the world!

We hope to see you there!

July 09, 2019 11:29 PM

Matthew Garrett: Bug bounties and NDAs are an option, not the standard

Zoom had a vulnerability that allowed users on MacOS to be connected to a video conference with their webcam active simply by visiting an appropriately crafted page. Zoom's response has largely been to argue that:

a) There's a setting you can toggle to disable the webcam being on by default, so this isn't a big deal,
b) When Safari added a security feature requiring that users explicitly agree to launch Zoom, this created a poor user experience and so they were justified in working around this (and so introducing the vulnerability), and,
c) The submitter asked whether Zoom would pay them for disclosing the bug, and when Zoom said they'd only do so if the submitter signed an NDA, they declined.

(a) and (b) are clearly ludicrous arguments, but (c) is the interesting one. Zoom go on to mention that they disagreed with the severity of the issue, and in the end decided not to change how their software worked. If the submitter had agreed to the terms of the NDA, then Zoom's decision that this was a low severity issue would have led to them being given a small amount of money and never being allowed to talk about the vulnerability. Since Zoom apparently have no intention of fixing it, we'd presumably never have heard about it. Users would have been less informed, and the world would have been a less secure place.

The point of bug bounties is to provide people with an additional incentive to disclose security issues to companies. But what incentive are they offering? Well, that depends on who you are. For many people, the amount of money offered by bug bounty programs is meaningful, and agreeing to sign an NDA is worth it. For others, the ability to publicly talk about the issue is worth more than whatever the bounty may award - being able to give a presentation on the vulnerability at a high profile conference may be enough to get you a significantly better paying job. Others may be unwilling to sign an NDA on principle, refusing to trust that the company will ever disclose the issue or fix the vulnerability. And finally there are people who can't sign such an NDA - they may have discovered the issue on work time, and employer policies may prohibit them doing so.

Zoom are correct that it's not unusual for bug bounty programs to require NDAs. But when they talk about this being an industry standard, they come awfully close to suggesting that the submitter did something unusual or unreasonable in rejecting their bounty terms. When someone lets you know about a vulnerability, they're giving you an opportunity to have the issue fixed before the public knows about it. They've done something they didn't need to do - they could have just publicly disclosed it immediately, causing significant damage to your reputation and potentially putting your customers at risk. They could potentially have sold the information to a third party. But they didn't - they came to you first. If you want to offer them money in order to encourage them (and others) to do the same in future, then that's great. If you want to tie strings to that money, that's a choice you can make - but there's no reason for them to agree to those strings, and if they choose not to then you don't get to complain about that afterwards. And if they make it clear at the time of submission that they intend to publicly disclose the issue after 90 days, then they're acting in accordance with widely accepted norms. If you're not able to fix an issue within 90 days, that's very much your problem.

If your bug bounty requires people to sign an NDA, you should think about why. If it's so you can control disclosure and delay things beyond 90 days (and potentially never disclose at all), look at whether the amount of money you're offering for that is anywhere near commensurate with the value the submitter could otherwise gain from the information, and compare that to the reputational damage you'll take from people deciding that it's not worth it and just disclosing unilaterally. And, seriously, never ask for an NDA before committing to a specific $ amount - it's never reasonable to ask that someone sign away their rights without knowing exactly what they're getting in return.

tl;dr - a bug bounty should only be one component of your vulnerability reporting process. You need to be prepared for people to decline any restrictions you wish to place on them, and you need to be prepared for them to disclose on the date they initially proposed. If they give you 90 days, that's entirely within industry norms. Remember that a bargain is being struck here - you offering money isn't being generous, it's you attempting to provide an incentive for people to help you improve your security. If you're asking people to give up more than you're offering in return, don't be surprised if they say no.


July 09, 2019 09:15 PM

Linux Plumbers Conference: Update on LPC 2019 registration waiting list

Here is an update regarding the registration situation for LPC2019.

The considerable interest for participation this year meant that the conference sold out earlier than ever before.

Instead of a small release of late-registration spots, the LPC planning committee has decided to run a waiting list, which will be used as the exclusive method for additional registrations. The planning committee will reach out to individuals on the waiting list and invite them to register at the regular rate of $550 as spots become available.

With the majority of the Call for Proposals (CfP) still open, it is not yet possible to release passes. The planning committee and microconference leads are working together to allocate the passes earmarked for microconferences. The Networking Summit and Kernel Summit speakers also have yet to be confirmed.

The planning committee understands that many of those who added themselves to the waiting list wish to find out soon whether they will be issued a pass. We anticipate the first passes to be released on July 22nd at the earliest.

Please follow us on social media, or here on this blog for further updates.

July 09, 2019 01:36 AM

July 08, 2019

Linux Plumbers Conference: VFIO/IOMMU/PCI Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the VFIO/IOMMU/PCI Microconference has been accepted into the 2019 Linux Plumbers Conference!

The PCI interconnect specification and the devices implementing it are incorporating more and more features aimed at high-performance systems. This requires the kernel to coordinate the PCI devices, the IOMMUs they are connected to, and the VFIO layer used to manage them (for user-space access and device pass-through) so that users and virtual machines can use them effectively. The kernel interfaces that control PCI devices have to be designed in sync across all three subsystems, which means there are many intersections in the design of the kernel control paths for VFIO/IOMMU/PCI, requiring design discussions that involve all three subsystems at once.

A successful VFIO/IOMMU/PCI Microconference was held at Linux Plumbers in 2017, where:

This year, the microconference will follow up on the previous agendas and focus on ongoing patch review and design work aimed at the VFIO/IOMMU/PCI subsystems.

Topics for this year include:

Come and join us in the discussion and help Linux keep up with the new features being added to the PCI interconnect specification.

We hope to see you there!

July 08, 2019 05:56 PM

July 07, 2019

Matthew Garrett: Creating hardware where no hardware exists

The laptop industry was still in its infancy back in 1990, but it already faced a core problem that we still face today - power and thermal management are hard, but also critical to a good user experience (and potentially to the lifespan of the hardware). These were the days when DOS and Windows had no memory protection, so handling these problems at the OS level would have been an invitation for someone to overwrite your management code and potentially kill your laptop. The safe option was pushing all of this out to an external management controller of some sort, but vendors in the 90s were the same as vendors now and would do basically anything to avoid having to drop an extra chip on the board. Thankfully(?), Intel had a solution.

The 386SL was released in October 1990 as a low-powered mobile-optimised version of the 386. Critically, it included a feature that let vendors ensure that their power management code could run without OS interference. A small window of RAM was hidden behind the VGA memory[1] and the CPU configured so that various events would cause the CPU to stop executing the OS and jump to this protected region. It could then do whatever power or thermal management tasks were necessary and return control to the OS, which would be none the wiser. Intel called this System Management Mode, and we've never really recovered.

Step forward to the late 90s. USB is now a thing, but even the operating systems that support USB usually don't in their installers (and plenty of operating systems still didn't have USB drivers). The industry needed a transition path, and System Management Mode was there for them. By configuring the chipset to generate a System Management Interrupt (or SMI) whenever the OS tried to access the PS/2 keyboard controller, the CPU could then trap into some SMM code that knew how to talk to USB, figure out what was going on with the USB keyboard, fake up the results and pass them back to the OS. As far as the OS was concerned, it was talking to a normal keyboard controller - but in reality, the "hardware" it was talking to was entirely implemented in software on the CPU.

Since then we've seen even more stuff get crammed into SMM, which is annoying because in general it's much harder for an OS to do interesting things with hardware if the CPU occasionally stops in order to run invisible code to touch hardware resources you were planning on using, and that's even ignoring the fact that operating systems in general don't really appreciate the entire world stopping and then restarting some time later without any notification. So, overall, SMM is a pain for OS vendors.

Change of topic. When Apple moved to x86 CPUs in the mid 2000s, they faced a problem. Their hardware was basically now just a PC, and that meant people were going to try to run their OS on random PC hardware. For various reasons this was unappealing, and so Apple took advantage of the one significant difference between their platforms and generic PCs. x86 Macs have a component called the System Management Controller that (ironically) seems to do a bunch of the stuff that the 386SL was designed to do on the CPU. It runs the fans, it reports hardware information, it controls the keyboard backlight, it does all kinds of things. So Apple embedded a string in the SMC, and the OS tries to read it on boot. If it fails, so does boot[2]. Qemu has a driver that emulates enough of the SMC that you can provide that string on the command line and boot OS X in qemu, something that's documented further here.

What does this have to do with SMM? It turns out that you can configure x86 chipsets to trap into SMM on arbitrary IO port ranges, and older Macs had SMCs in IO port space[3]. After some fighting with Intel documentation[4] I had Coreboot's SMI handler responding to writes to an arbitrary IO port range. With some more fighting I was able to fake up responses to reads as well. And then I took qemu's SMC emulation driver and merged it into Coreboot's SMM code. Now, accesses to the IO port range that the SMC occupies on real hardware generate SMIs, trap into SMM on the CPU, run the emulation code, handle writes, fake up responses to reads and return control to the OS. From the OS's perspective, this is entirely invisible[5]. We've created hardware where none existed.
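
To make that flow concrete, here is a rough user-space sketch of the kind of state machine such an SMI handler has to implement. This is not Coreboot or qemu code: the port numbers, the command byte and the key contents are stand-in values, and smc_io_write()/smc_io_read() are hypothetical hooks that a real handler would wire up to the trapped IO accesses.

/* Illustrative only: a tiny SMC-style key store behind a data/command
 * port pair, with placeholder ports, commands and key values. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SMC_DATA_PORT 0x300             /* assumed data port */
#define SMC_CMD_PORT  0x304             /* assumed command port */
#define SMC_CMD_READ  0x10              /* assumed "read key" command */

struct smc_key { char name[5]; char value[40]; uint8_t len; };

static const struct smc_key keys[] = {
    { "OSK0", "placeholder-not-the-real-key", 28 },
};

static struct {
    uint8_t cmd;
    char key[4];
    uint8_t key_pos, data_pos;
    const struct smc_key *match;
} state;

/* Called when the OS writes to a trapped port. */
void smc_io_write(uint16_t port, uint8_t val)
{
    if (port == SMC_CMD_PORT) {
        state.cmd = val;
        state.key_pos = state.data_pos = 0;
        state.match = NULL;
    } else if (port == SMC_DATA_PORT && state.cmd == SMC_CMD_READ &&
               state.key_pos < 4) {
        state.key[state.key_pos++] = (char)val;
        if (state.key_pos == 4)         /* full key name received: look it up */
            for (size_t i = 0; i < sizeof(keys) / sizeof(keys[0]); i++)
                if (!memcmp(keys[i].name, state.key, 4))
                    state.match = &keys[i];
    }
}

/* Called to fake up the value the OS reads back. */
uint8_t smc_io_read(uint16_t port)
{
    if (port == SMC_DATA_PORT && state.match &&
        state.data_pos < state.match->len)
        return (uint8_t)state.match->value[state.data_pos++];
    return 0xff;                        /* nothing to report */
}

int main(void)
{
    /* Pretend the OS asks for key "OSK0" and reads the answer back. */
    smc_io_write(SMC_CMD_PORT, SMC_CMD_READ);
    for (int i = 0; i < 4; i++)
        smc_io_write(SMC_DATA_PORT, (uint8_t)"OSK0"[i]);
    for (int i = 0; i < 28; i++)
        putchar(smc_io_read(SMC_DATA_PORT));
    putchar('\n');
    return 0;
}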

The tree where I'm working on this is here, and I'll see if it's possible to clean this up in a reasonable way to get it merged into mainline Coreboot. Note that this only handles the SMC - actually booting OS X involves a lot more, but that's something for another time.

[1] If the OS attempts to access this range, the chipset directs it to the video card instead of to actual RAM.
[2] It's actually more complicated than that - see here for more.
[3] IO port space is a weird x86 feature where there's an entire separate IO bus that isn't part of the memory map and which requires different instructions to access. It's low performance but also extremely simple, so hardware that has no performance requirements is often implemented using it.
[4] Some current Intel hardware has two sets of registers defined for setting up which IO ports should trap into SMM. I can't find anything that documents what the relationship between them is, but if you program the obvious ones nothing happens and if you program the ones that are hidden in the section about LPC decoding ranges things suddenly start working.
[5] Eh technically a sufficiently enthusiastic OS could notice that the time it took for the access to occur didn't match what it should on real hardware, or could look at the CPU's count of the number of SMIs that have occurred and correlate that with accesses, but good enough


July 07, 2019 08:15 PM

Linux Plumbers Conference: Scheduler Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Scheduler Microconference has been accepted into the 2019 Linux Plumbers Conference! The scheduler determines what runs on the CPU at any given time; the lag of your desktop, for example, is affected by it. There are a few different scheduling classes for a user to choose from, such as the default class (SCHED_OTHER), the real-time classes (SCHED_FIFO and SCHED_RR), and the deadline class (SCHED_DEADLINE). The deadline scheduler is the newest and allows the user to control the amount of bandwidth received by a task or group of tasks. With cloud computing becoming popular these days, controlling the bandwidth of containers or virtual machines is becoming more important. The Real-Time patch is also destined to become mainline, which will add more strain on the scheduling of tasks to make sure that real-time tasks make their deadlines (although this microconference will focus on non-real-time aspects of the scheduler; please defer real-time topics to the Real-Time Microconference). This requires verification techniques to ensure the scheduler is properly designed.
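
As a concrete illustration of that bandwidth control, here is a minimal sketch of switching the calling task to SCHED_DEADLINE with sched_setattr(). It assumes an x86-64 Linux system whose headers define SYS_sched_setattr, a glibc that does not already provide its own sched_setattr() wrapper or sched_attr definition, and sufficient privileges to change scheduling policy:

/* Sketch: ask for 10ms of CPU time out of every 100ms via SCHED_DEADLINE. */
#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6
#endif

struct sched_attr {
    uint32_t size;
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;                /* for SCHED_OTHER/SCHED_BATCH */
    uint32_t sched_priority;            /* for SCHED_FIFO/SCHED_RR */
    uint64_t sched_runtime;             /* SCHED_DEADLINE, in nanoseconds */
    uint64_t sched_deadline;
    uint64_t sched_period;
};

int main(void)
{
    struct sched_attr attr = {
        .size           = sizeof(attr),
        .sched_policy   = SCHED_DEADLINE,
        .sched_runtime  =  10 * 1000 * 1000,    /* 10ms of runtime...   */
        .sched_deadline = 100 * 1000 * 1000,    /* ...due every 100ms   */
        .sched_period   = 100 * 1000 * 1000,
    };

    /* pid 0 means the calling thread; the last argument is flags. */
    if (syscall(SYS_sched_setattr, 0, &attr, 0) < 0) {
        perror("sched_setattr");
        return 1;
    }
    printf("now running under SCHED_DEADLINE\n");
    return 0;
}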

Topics for this year include:

Come and join us in the discussion of controlling what tasks get to run on your machine and when.

We hope to see you there!

July 07, 2019 02:51 PM

July 03, 2019

Linux Plumbers Conference: RDMA Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the RDMA Microconference has been accepted into the 2019 Linux Plumbers Conference! RDMA has been a microconference at Plumbers for the last three years and will be continuing its productive work for a fourth year. The RDMA meetings at the previous Plumbers have been critical in getting improvements to the RDMA subsystem merged into mainline. These include a new user API, container support, testability/syzkaller, system bootup, Soft iWarp, and more. There are still difficult open issues that need to be resolved, and this year’s Plumbers RDMA Microconference is sure to come up with answers to these tough problems.

Topics for this year include:

And new developing areas of interest:

Come and join us in the discussion of improving Linux’s ability to access direct memory across high-speed networks.

We hope to see you there!

July 03, 2019 10:54 PM

July 02, 2019

Linux Plumbers Conference: Announcing the LPC 2019 registration waiting list

 

The current pool of registrations for the 2019 Linux Plumbers Conference has sold out.

Those not yet registered who wish to attend should fill out the form here to get on the waiting list.

As registration spots open up, the Plumbers organizing committee will allocate them to those on the waiting list, with priority given to those who will be participating in microconferences and BoFs.

 

 

July 02, 2019 07:22 PM

Linux Plumbers Conference: Preliminary schedule for LPC 2019 has been published

The LPC committee is pleased to announce the preliminary schedule for the 2019 Linux Plumbers Conference.

The vast majority of the LPC refereed track talks have been accepted and are listed there. The same is true for microconferences. While there are a few talks and microconferences to be announced, you will find the current overview LPC schedule here. The LPC refereed track talks can be seen here.

The call for proposals (CfP) is still open for the Kernel Summit, Networking Summit, BOFs and topics for accepted microconferences.

As new microconferences, talks, and BOFs are accepted, they will be published to the schedule.

July 02, 2019 04:48 PM

June 30, 2019

Matthew Garrett: Which smart bulbs should you buy (from a security perspective)

People keep asking me which smart bulbs they should buy. It's a great question! As someone who has, for some reason, ended up spending a bunch of time reverse engineering various types of lightbulb, I'm probably a reasonable person to ask. So. There are four primary communications mechanisms for bulbs: wifi, bluetooth, zigbee and zwave. There's basically zero compelling reasons to care about zwave, so I'm not going to.

Wifi


Advantages: Doesn't need an additional hub - you can just put the bulbs wherever. The bulbs can connect out to a cloud service, so you can control them even if you're not on the same network.
Disadvantages: Only works if you have wifi coverage, each bulb has to have wifi hardware and be configured appropriately.
Which should you get: If you search Amazon for "wifi bulb" you'll get a whole bunch of cheap bulbs. Don't buy any of them. They're mostly based on a custom protocol from Zengge and they're shit. Colour reproduction is bad, there's no good way to use the colour LEDs and the white LEDs simultaneously, and if you use any of the vendor apps they'll proxy your device control through a remote server with terrible authentication mechanisms. Just don't. The ones that aren't Zengge are generally based on the Tuya platform, whose security model is to have keys embedded in some incredibly obfuscated code and hope that nobody can find them. TP-Link make some reasonably competent bulbs but also use a weird custom protocol with hand-rolled security. Eufy are fine but again there's weird custom security. Lifx are the best bulbs, but have zero security on the local network - anyone on your wifi can control the bulbs. If that's something you care about then they're a bad choice, but also if that's something you care about maybe just don't let people you don't trust use your wifi.
Conclusion: If you have to use wifi, go with lifx. Their security is not meaningfully worse than anything else on the market (and they're better than many), and they're better bulbs. But you probably shouldn't go with wifi.

Bluetooth


Advantages: Doesn't need an additional hub. Doesn't need wifi coverage. Doesn't connect to the internet, so remote attack is unlikely.
Disadvantages: Only one control device at a time can connect to a bulb, so harder to share. Control device needs to be in Bluetooth range of the bulb. Doesn't connect to the internet, so you can't control your bulbs remotely.
Which should you get: Again, most Bluetooth bulbs you'll find on Amazon are shit. There's a whole bunch of weird custom protocols and the quality of the bulbs is just bad. If you're going to go with anything, go with the C by GE bulbs. Their protocol is still some AES-encrypted custom binary thing, but they use a Bluetooth controller from Telink that supports a mesh network protocol. This means that you can talk to any bulb in your network and still send commands to other bulbs - the dual advantages here are that you can communicate with bulbs that are outside the range of your control device and also that you can have as many control devices as you have bulbs. If you've bought into the Google Home ecosystem, you can associate them directly with a Home and use Google Assistant to control them remotely. GE also sell a wifi bridge - I have one, but haven't had time to review it yet, so make no assertions around its competence. The colour bulbs are also disappointing, with much dimmer colour output than white output.

Zigbee


Advantages: Zigbee is a mesh protocol, so bulbs can forward messages to each other. The bulbs are also pretty cheap. Zigbee is a standard, so you can obtain bulbs from several vendors that will then interoperate - unfortunately there are actually two separate standards for Zigbee bulbs, and you'll sometimes find yourself with incompatibility issues there.
Disadvantages: Your phone doesn't have a Zigbee radio, so you can't communicate with the bulbs directly. You'll need a hub of some sort to bridge between IP and Zigbee. The ecosystem is kind of a mess, and you may have weird incompatibilities.
Which should you get: Pretty much every vendor that produces Zigbee bulbs also produces a hub for them. Don't get the Sengled hub - anyone on the local network can perform arbitrary unauthenticated command execution on it. I've previously recommended the Ikea Tradfri, which at the time only had local control. They've since added remote control support, and I haven't investigated that in detail. But overall, I'd go with the Philips Hue. Their colour bulbs are simply the best on the market, and their security story seems solid - performing a factory reset on the hub generates a new keypair, and adding local control users requires a physical button press on the hub to allow pairing. Using the Philips hub doesn't tie you into only using Philips bulbs, but right now the Philips bulbs tend to be as cheap (or cheaper) than anything else.

But what about


If you're into tying together all kinds of home automation stuff, then either go with Smartthings or roll your own with Home Assistant. Both are definitely more effort if you only want lighting.

My priority is software freedom


Excellent! There are various bulbs that can run the Espurna or AiLight firmwares, but you'll have to deal with flashing them yourself. You can tie that into Home Assistant and have a completely free stack. If you're ok with your bulbs being proprietary, Home Assistant can speak to most types of bulb without an additional hub (you'll need a supported Zigbee USB stick to control Zigbee bulbs), and will support the C by GE ones as soon as I figure out why my Bluetooth transmissions stop working every so often.

Conclusion


Outside niche cases, just buy a Hue. Philips have done a genuinely good job. Don't buy cheap wifi bulbs. Don't buy a Sengled hub.

(Disclaimer: I mentioned a Google product above. I am a Google employee, but do not work on anything related to Home.)


June 30, 2019 08:10 PM

June 26, 2019

Linux Plumbers Conference: Databases Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Databases Microconference has been accepted into the 2019 Linux Plumbers Conference! Linux plumbing is critically important to those who implement databases and to their users, who expect fast and durable data handling.

Durability is a promise never to lose data after advising a user of a successful update, even in the face of power loss. It requires a full-stack solution from the application to the database, then to Linux (filesystem, VFS, block interface, driver), and on to the hardware.

Fast means getting a database user a response in less than tens of milliseconds, which requires that Linux filesystems, memory and CPU management, and the networking stack do everything with the utmost effectiveness and efficiency.

For all Linux users, there is a benefit in having database developers interact with system developers; it will ensure that the promises of durability and speed are both kept as newer hardware technologies emerge, existing CPU/RAM resources grow, and the amount of stored data grows even faster.
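
To make the durability promise above concrete, the minimal pattern on the application side looks something like the following sketch (error handling abbreviated): write the data, fsync() the file, and, for a newly created file, fsync() the containing directory so the directory entry itself is persistent. Whether the data has truly reached stable storage then depends on the rest of the stack described above.

/* Durability sketch: only report success after the file data and the
 * directory entry have both been fsync()ed. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int durable_write(const char *dir, const char *path,
                         const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
        close(fd);
        return -1;                      /* do NOT report success */
    }
    close(fd);

    /* Persist the directory entry for the newly created file as well. */
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;
    int ret = fsync(dfd);
    close(dfd);
    return ret;
}

int main(void)
{
    const char msg[] = "committed\n";
    if (durable_write(".", "./journal.log", msg, strlen(msg)) < 0) {
        perror("durable_write");
        return 1;
    }
    puts("update durable, as far as the kernel and the device report");
    return 0;
}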

Topics for this Microconference include:

Come and join us in the discussion about making databases run smoother and faster.

We hope to see you there!

June 26, 2019 03:14 PM

June 25, 2019

James Morris: Linux Security Summit North America 2019: Schedule Published

The schedule for the 2019 Linux Security Summit North America (LSS-NA) is published.

This year, there are some changes to the format of LSS-NA. The summit runs for three days instead of two, which allows us to relax the schedule somewhat while also adding new session types.  In addition to refereed talks, short topics, BoF sessions, and subsystem updates, there are now also tutorials (one each day), unconference sessions, and lightning talks.

The tutorial sessions are:

These tutorials will be 90 minutes in length, and they’ll run in parallel with unconference sessions on the first two days (when the space is available at the venue).

The refereed presentations and short topics cover a range of Linux security topics including platform boot security, integrity, container security, kernel self protection, fuzzing, and eBPF+LSM.

Some of the talks I’m personally excited about include:

The schedule last year was pretty crammed, so with the addition of the third day we’ve been able to avoid starting early, and we’ve also added five minute transitions between talks. We’re hoping to maximize collaboration via the more relaxed schedule and the addition of more types of sessions (unconference, tutorials, lightning talks).  This is not a conference for simply consuming talks, but to also participate and to get things done (or started).

Thank you to all who submitted proposals.  As usual, we had many more submissions than can be accommodated in the available time.

Also thanks to the program committee, who spent considerable time reviewing and discussing proposals, and working out the details of the schedule. The committee for 2019 is:

  • James Morris (Microsoft)
  • Serge Hallyn (Cisco)
  • Paul Moore (Cisco)
  • Stephen Smalley (NSA)
  • Elena Reshetova (Intel)
  • John Johansen (Canonical)
  • Kees Cook (Google)
  • Casey Schaufler (Intel)
  • Mimi Zohar (IBM)
  • David A. Wheeler (Institute for Defense Analyses)

And of course many thanks to the event folk at Linux Foundation, who handle all of the logistics of the event.

LSS-NA will be held in San Diego, CA on August 19-21. To register, click here. Or you can register for the co-located Open Source Summit and add LSS-NA.

 

June 25, 2019 08:43 PM

June 19, 2019

Linux Plumbers Conference: Real-Time Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Real-Time Microconference has been accepted into the 2019 Linux Plumbers Conference! The PREEMPT_RT patch set (aka “The Real-Time Patch”) was created in 2004 in an effort to turn Linux into a hard real-time operating system. Over the years much of the RT patch has made it into mainline Linux, which includes: mutexes, lockdep, high-resolution timers, Ftrace, RCU_PREEMPT, priority inheritance, threaded interrupts and much more. There’s just a little left to get RT fully into mainline, and the light at the end of the tunnel is finally in view. It is expected that the RT patch will be in mainline within a year, which changes the topics of discussion. Once it is in Linus’s tree, a whole new set of issues must be handled. The focus of this year’s Plumbers event will include:

Come and join us in the discussion of making the LWN prediction of RT coming into mainline “this year” a reality!

We hope to see you there!

June 19, 2019 10:55 PM

Linux Plumbers Conference: Testing and Fuzzing Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Testing and Fuzzing Microconference has been accepted into the 2019 Linux Plumbers Conference! Testing and fuzzing are crucial to the stability that the Linux kernel demands.

Last year’s microconference brought about a number of discussions; for example, syzkaller evolved as syzbot, which keeps track of fuzzing efforts and the resulting fixes. The closing ceremony pointed out all the work that still has to be done: There are a number of overlapping efforts, and those need to be consolidated. The use of KASAN should be increased. Where is fuzzing going next? With real-time moving forward from “if” to “when” in the mainline, how does RT test coverage increase? The unit-testing frameworks may need some unification. Also, KernelCI will be announced as an LF project this time around. Stay around for the KernelCI hackathon after the conference to help further those efforts.

Come and join us for the discussion!

We hope to see you there!

June 19, 2019 02:29 AM

June 17, 2019

Linux Plumbers Conference: Toolchains Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Toolchains Microconference has been accepted into the 2019 Linux Plumbers Conference! The Linux kernel may be one of the most powerful systems around, but it takes a powerful toolchain to make that happen. The kernel takes advantage of any feature that the toolchains provide, and collaboration between the kernel and toolchain developers will make that much more seamless.

Toolchains topics will include:

Come and join us in the discussion of what makes it possible to build the most robust and flexible kernel in the world!

We hope to see you there!

June 17, 2019 06:10 PM

June 15, 2019

Greg Kroah-Hartman: Linux stable tree mirror at github

As everyone seems to like to put kernel trees up on github for random projects (based on the crazy notifications I get all the time), I figured it was time to put up a semi-official mirror of all of the stable kernel releases on github.com

It can be found at: https://github.com/gregkh/linux and I will try to keep it up to date with the real source of all kernel stable releases at https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/

It differs from Linus’s tree at: https://github.com/torvalds/linux in that it contains all of the different stable tree branches and stable releases and tags, which many devices end up building on top of.

So, mirror away!

Also note, this is a read-only mirror, any pull requests created on it will be gleefully ignored, just like happens on Linus’s github mirror.

If people think this is needed on any other git hosting site, just let me know and I will be glad to push to other places as well.

This notification was also cross-posted on the new http://people.kernel.org/ site, go follow that for more kernel developer stuff.

June 15, 2019 08:10 PM

June 14, 2019

Linux Plumbers Conference: Open Printing Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Open Printing Microconference has been accepted into the 2019 Linux Plumbers Conference! In today’s world much is done online, but getting a hardcopy is still very much needed. Then there’s the case of having a hardcopy and wanting to scan it to make it digital. All of this needs to work well on Linux to keep Linux-based and open source operating systems relevant. Also, with the progress in technology, using modern printers and scanners is becoming simpler: the driverless concept has made printing and scanning easier, getting the job done with a few simple clicks without requiring the user to install any kind of driver software. The Open Printing organization has been tasked with getting this job done. This microconference will focus on what needs to be accomplished to keep Linux and open source operating systems a leader in today’s market.

Topics for this Microconference include:

Come and join us in the discussion of keeping your printers working.

We hope to see you there!

June 14, 2019 03:18 PM

June 13, 2019

Linux Plumbers Conference: Live Patching Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Live Patching Microconference has been accepted into the 2019 Linux Plumbers Conference! There are some workloads that require 100% uptime so rebooting for maintenance is not an option. But this can make the system insecure as new security vulnerabilities may have been discovered in the running kernel. Live kernel patching is a technique to update the kernel without taking down the machine. As one can imagine, patching a running kernel is far from trivial. Although it is being used in production today[1][2], there are still many issues that need to be solved.

These include:

Come and join us in the discussion about changing your running kernel without having to take it down!

We hope to see you there!

June 13, 2019 08:19 PM

June 12, 2019

Linux Plumbers Conference: You, Me and IoT Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the You, Me and IoT Microconference has been accepted into the 2019 Linux Plumbers Conference! IoT is becoming an integral part of our daily lives, controlling such devices as on/off switches, temperature controls, door and window sensors and so much more. But the technology itself requires a lot of infrastructure and communication frameworks, such as Zigbee, OpenHAB and 6LoWPAN. Open source real-time embedded operating systems, such as Zephyr, also come into play. Greybus, a completely open source framework implementation, has already made it into staging. Discussions will be around Greybus:

– Device management
– Abstracted devices
– Management of Unique IDs
– Network management
– Userspace utilities
– Network Authentication
– Encryption
– Firmware updates
– And more

Come join us and participate in the discussion on what keeps the Internet of Things together.

We hope to see you there!

June 12, 2019 02:00 PM

June 07, 2019

Pete Zaitcev: PostgreSQL and upgrades

As mentioned previously, I run a personal Fediverse instance with Pleroma, which uses Postgres. On Fedora, of course. So, a week ago, I went to do the usual "dnf distro-sync --releasever=30". And then, Postgres fails to start, because the database uses the previous format, 10, and the packages in F30 require format 11. Apparently, I was supposed to dump the database with pg_dumpall, upgrade, then restore. But now that I have binaries that refuse to read the old format, dumping is impossible. Wow.

A little web searching found an upgrader that works across formats (dnf install postgresql-upgrade; postgresql-setup --upgrade). But that one also copies the database, like a dump-restore procedure would. What if the database is too large for this? Am I the only one who finds these practices unacceptable?

Postgres was supposed to be a solid big brother to a boisterous but unreliable upstart MySQL, kind of like Postfix and Exim. But this is just such an absurd fault, it makes me think that I'm missing something essential.

UPDATE: Kaz commented that a form of -compat is conventional:

When I've upgraded in the past, Ubuntu has always just installed the new version of postgres alongside the old one, to allow me to manually export and reimport at my leisure, then remove the old version afterward. Because both are installed, you could pipe the output of one dumpall to the psql command on the other database and the size doesn't matter. The apps continue to point at their old version until I redirect them.

Yeah, as far as I can tell, Fedora does not package anything like that.

June 07, 2019 02:04 PM

June 04, 2019

Linux Plumbers Conference: Containers and Checkpoint/Restore MC Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Containers and Checkpoint/Restore Microconference has been accepted into the 2019 Linux Plumbers Conference! Even after the success of last year’s Containers Microconference, there’s still more to work on this year.

Last year had a security focus that featured seccomp support and LSM namespacing and stacking; now the next steps, and the remaining blockers for them, need to be discussed.

Since last year’s Linux Plumbers in Vancouver, binderfs has been accepted into mainline, but more work is needed in order to fully support Android containers.

Another improvement since Vancouver is that shiftfs is now functional and included in Ubuntu, however, more work is required (including changes to VFS) before shiftfs can be accepted into mainline.

CGroup V2 is an ongoing task needing more work, with one topic of particular interest being feature parity with V1.

Additional important discussion topics include:

Come join us and participate in the discussion on what holds “The Cloud” together.

June 04, 2019 02:05 PM

June 03, 2019

Pete Zaitcev: Pi-hole

With the recent move by Google to disable ad-blockers in Chrome (except for Enterprise level customers[1]), interest is sure to increase in methods of protection against ad-delivered malware other than browser plug-ins. I'm sure Barracuda will make some coin if it's still around. And on the free software side, someone is making an all-in-one package for Raspberry Pi, called "Pi-hole". It works by screwing with DNS, which is actually an impressive demonstration of what an attack on DNS can do.

An obvious problem with Pi-hole is what happens to laptops when they are outside of the home site protection. I suppose one could devise a clone of Pi-hole that plugs into the dnsmasq. Every Fedora system runs one, because NM needs it in order to support the correct lookup on VPNs {Update: see below}. The most valuable part of Pi-hole is the blocklist, the rest is just scripting.

[1] "Google’s Enterprise ad-blocking exception doesn’t seem to include G Suite’s low and mid-tier subscribers. G Suite Basic is $6 per user per month and G Suite Business is $12 per user month."

UPDATE: Ouch. A link by Roy Schestovitz made me remember how it actually worked, and I was wrong above: NM does not run dnsmasq by default. It only has the capability to do so, if you want DNS lookups on VPNs to work correctly. So, every user of VPNs enables "dns=dnsmasq" in NM. But it is not the default.

UPDATE: A reader mentions that he was rooted by ads served by Space.com. Only 1 degree of separation (beyond Windows in my family).

June 03, 2019 02:37 PM

May 28, 2019

Kees Cook: security things in Linux v5.1

Previously: v5.0.

Linux kernel v5.1 has been released! Here are some security-related things that stood out to me:

introduction of pidfd
Christian Brauner landed the first portion of his work to remove pid races from the kernel: using a file descriptor to reference a process (“pidfd”). Now /proc/$pid can be opened and used as an argument for sending signals with the new pidfd_send_signal() syscall. This handle will only refer to the original process at the time the open() happened, and not to any later “reused” pid if the process dies and a new process is assigned the same pid. Using this method, it’s now possible to racelessly send signals to exactly the intended process without having to worry about pid reuse. (BTW, this commit wins the 2019 award for Most Well Documented Commit Log Justification.)
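
A minimal sketch of the v5.1 interface, assuming an x86-64 system where the syscall number still has to be supplied by hand (there was no glibc wrapper at the time): open /proc/$pid to pin down the identity of the target, then signal it through the resulting descriptor.

/* Send SIGTERM to a pid without racing against pid reuse. */
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424      /* x86-64 value */
#endif

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    char path[64];
    snprintf(path, sizeof(path), "/proc/%s", argv[1]);

    /* This fd refers to the process as it exists now, not to whatever
     * later happens to reuse the same pid. */
    int pidfd = open(path, O_RDONLY);
    if (pidfd < 0) {
        perror("open");
        return 1;
    }

    /* pidfd_send_signal(pidfd, sig, siginfo, flags) */
    if (syscall(__NR_pidfd_send_signal, pidfd, SIGTERM, NULL, 0) < 0) {
        perror("pidfd_send_signal");
        return 1;
    }
    return 0;
}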

explicitly test for userspace mappings of heap memory
During Linux Conf AU 2019 Kernel Hardening BoF, Matthew Wilcox noted that there wasn’t anything in the kernel actually sanity-checking when userspace mappings were being applied to kernel heap memory (which would allow attackers to bypass the copy_{to,from}_user() infrastructure). Driver bugs or attackers able to confuse mappings wouldn’t get caught, so he added checks. To quote the commit logs: “It’s never appropriate to map a page allocated by SLAB into userspace” and “Pages which use page_type must never be mapped to userspace as it would destroy their page type”. The latter check almost immediately caught a bad case, which was quickly fixed to avoid page type corruption.

LSM stacking: shared security blobs
Casey Schaufler has landed one of the major pieces of getting multiple Linux Security Modules (LSMs) running at the same time (called “stacking”). It is now possible for LSMs to share the security-specific storage “blobs” associated with various core structures (e.g. inodes, tasks, etc) that LSMs can use for saving their state (e.g. storing which profile a given task is confined under). The kernel originally gave only the single active “major” LSM (e.g. SELinux, AppArmor, etc) full control over the entire blob of storage. With “shared” security blobs, the LSM infrastructure does the allocation and management of the memory, and LSMs use an offset for reading/writing their portion of it. This unblocks the way for “medium sized” LSMs (like SARA and Landlock) to get stacked with a “major” LSM as they need to store much more state than the “minor” LSMs (e.g. Yama, LoadPin) which could already stack because they didn’t need blob storage.
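
The offset scheme is easier to see in a toy model than in prose. The sketch below is plain user-space C, not kernel code, and all names in it are made up: two pretend LSMs declare how much per-task state they need, the framework adds the sizes up, and each LSM then addresses its own slice of a single shared allocation through a fixed offset.

/* Toy model of shared security blobs: one allocation per object, each
 * "LSM" reads and writes only its own offset into it. */
#include <stdio.h>
#include <stdlib.h>

struct lsm_a_state { int profile_id; };         /* pretend "major" LSM */
struct lsm_b_state { unsigned long flags; };    /* pretend "medium" LSM */

static size_t lsm_a_offset, lsm_b_offset, blob_total;

static size_t reserve(size_t need)              /* framework side */
{
    size_t off = blob_total;
    blob_total += need;                         /* alignment ignored here */
    return off;
}

static void lsm_init(void)
{
    lsm_a_offset = reserve(sizeof(struct lsm_a_state));
    lsm_b_offset = reserve(sizeof(struct lsm_b_state));
}

/* Each LSM only ever touches its own slice of the shared blob. */
static struct lsm_a_state *lsm_a(void *blob)
{
    return (struct lsm_a_state *)((char *)blob + lsm_a_offset);
}
static struct lsm_b_state *lsm_b(void *blob)
{
    return (struct lsm_b_state *)((char *)blob + lsm_b_offset);
}

int main(void)
{
    lsm_init();

    void *task_blob = calloc(1, blob_total);    /* one shared allocation */
    lsm_a(task_blob)->profile_id = 42;
    lsm_b(task_blob)->flags = 0x1;

    printf("blob is %zu bytes; A at +%zu, B at +%zu\n",
           blob_total, lsm_a_offset, lsm_b_offset);
    free(task_blob);
    return 0;
}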

SafeSetID LSM
Micah Morton added the new SafeSetID LSM, which provides a way to narrow the power associated with the CAP_SETUID capability. Normally a process with CAP_SETUID can become any user on the system, including root, which makes it a meaningless capability to hand out to non-root users in order for them to “drop privileges” to some less powerful user. There are trees of processes under Chrome OS that need to operate under different user IDs and other methods of accomplishing these transitions safely weren’t sufficient. Instead, this provides a way to create a system-wide policy for user ID transitions via setuid() (and group transitions via setgid()) when a process has the CAP_SETUID capability, making it a much more useful capability to hand out to non-root processes that need to make uid or gid transitions.

ongoing: refcount_t conversions
Elena Reshetova continued landing more refcount_t conversions in core kernel code (e.g. scheduler, futex, perf), with an additional conversion in btrfs from Anand Jain. The existing conversions, mainly when combined with syzkaller, continue to show their utility at finding bugs all over the kernel.

ongoing: implicit fall-through removal
Gustavo A. R. Silva continued to make progress on marking more implicit fall-through cases. What’s so impressive to me about this work, like refcount_t, is how many bugs it has been finding (see all the “missing break” patches). It really shows how quickly the kernel benefits from adding -Wimplicit-fallthrough to keep this class of bug from ever returning.
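
For anyone who has not run into the warning, here is a small made-up example of the bug class being hunted, and of how an intentional fall-through can be annotated so that -Wimplicit-fallthrough stays quiet about it (the kernel mostly used /* fall through */ comments at the time, which GCC's default warning level also recognizes):

/* Compile with: gcc -Wimplicit-fallthrough -c fallthrough.c
 * Case 1 is a classic "missing break" bug and gets flagged; case 3 is an
 * intentional, annotated fall-through and does not. */
#include <stdio.h>

void handle(int event)
{
    switch (event) {
    case 1:
        printf("starting\n");
        /* BUG: missing break - execution continues into case 2, GCC warns */
    case 2:
        printf("running\n");
        break;
    case 3:
        printf("draining\n");
        __attribute__((fallthrough));   /* intentional */
    case 4:
        printf("stopped\n");
        break;
    default:
        break;
    }
}

int main(void)
{
    handle(1);  /* prints "starting" then, because of the bug, "running" */
    handle(3);  /* prints "draining" then "stopped", as intended */
    return 0;
}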

stack variable initialization includes scalars
The structleak gcc plugin (originally ported from PaX) had its “by reference” coverage improved to initialize scalar types as well (making “structleak” a bit of a misnomer: it now stops leaks from more than structs). Barring compiler bugs, this means that all stack variables in the kernel can be initialized before use at function entry. For variables not passed to functions by reference, the -Wuninitialized compiler flag (enabled via -Wall) already makes sure the kernel isn’t building with local-only uninitialized stack variables. And now with CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled, all variables passed by reference will be initialized as well. This should eliminate most, if not all, uninitialized stack flaws with very minimal performance cost (for most workloads it is lost in the noise), though it does not have the stack data lifetime reduction benefits of GCC_PLUGIN_STACKLEAK, which wipes the stack at syscall exit. Clang has recently gained similar automatic stack initialization support, and I’d love to see this feature in native gcc. To evaluate the coverage of the various stack auto-initialization features, I also wrote regression tests in lib/test_stackinit.c.

That’s it for now; please let me know if I missed anything. The v5.2 kernel development cycle is off and running already. :)

© 2019, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

May 28, 2019 03:49 AM

May 27, 2019

Linux Plumbers Conference: Distribution Kernels Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Distribution Kernels Microconference has been accepted to the 2019 Linux Plumbers Conference. This is the first time Plumbers has offered a microconference focused on kernel distribution collaboration.

Linux distributions come in many forms, ranging from community run distributions like Debian and Gentoo, to commercially supported ones offered by SUSE or Red Hat, to focused embedded distributions like Android or Yocto. Each of these distributions maintains a kernel, making choices related to features and stability. The focus of this track is on the pain points distributions face in maintaining their chosen kernel and common solutions every distribution can benefit from.

Example topics include:

“Distribution kernel” is used in a very broad manner. If you maintain a kernel tree for use by others, we welcome you to come and share your experiences.

Here is a list of proposed topics. For Linux Plumbers 2019, new topics for microconferences can be submitted via the Call for Proposals (CfP) interface. Please visit the CfP page for more information.

May 27, 2019 05:24 PM

May 26, 2019

Linux Plumbers Conference: Linux Plumbers Earlybird Registration Quota Reached, Regular Registration Opens 30 June

A few days ago we added more capacity to the earlybird registration quota, but that too has now filled up, so your next opportunity to register for Plumbers will be Regular Registration on 30 June … or alternatively the call for presentations to the refereed track is still open and accepted talks will get a free pass.

Quotas were added a few years ago to avoid the entire conference selling out months ahead of time and accommodate attendees whose approval process takes a while or whose company simply won’t allow them to register until closer to the date of the conference.

May 26, 2019 04:54 PM

May 22, 2019

Linux Plumbers Conference: Additional early bird slots available for LPC 2019

The Linux Plumbers Conference (LPC) registration web site has been showing “sold out” recently because the cap on early bird registrations was reached. We are happy to report that we have reviewed the registration numbers for this year’s conference and were able to open more early bird registration slots. Beyond that, regular registration will open July 1st. Please note that speakers and microconference runners get free passes to LPC, as do some microconference presenters, so that may be another way to attend the conference. Time is running out for new refereed-track and microconference proposals, so visit the CFP page soon. Topics for accepted microconferences are welcome as well.

LPC will be held in Lisbon, Portugal from Monday, September 9 through Wednesday, September 11.

We hope to see you there!

May 22, 2019 01:03 PM

May 20, 2019

James Morris: Linux Security Summit 2019 North America: CFP / OSS Early Bird Registration

The LSS North America 2019 CFP is currently open, and you have until May 31st to submit your proposal. (That’s the end of next week!)

If you’re planning on attending LSS NA in San Diego, note that the Early Bird registration for Open Source Summit (which we’re co-located with) ends today.

You can of course just register for LSS on its own, here.

May 20, 2019 08:56 PM

Linux Plumbers Conference: Tracing Microconference Accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the Tracing Microconference has been accepted into the 2019 Linux Plumbers Conference! Its return to Linux Plumbers shows that tracing is not finished in Linux, and there continue to be challenging problems to solve.

There’s a broad list of ways to perform tracing in Linux, from the original mainline Linux tracer, Ftrace, to profiling tools like perf, more complex customized tracing like BPF, and out-of-tree tracers like LTTng, SystemTap, and DTrace. Part of the trouble with tracing within Linux is that there is so much to choose from. Each of these has its own audience, but there is a lot of overlap. This year’s theme is to find those common areas and combine them into common utilities.

There is also a lot of new work that is happening and discussions between top maintainers will help keep everyone in sync, and provide good direction for the future.

Expected topics include:

Come and join us and not only learn but help direct the future progress of tracing inside the Linux kernel and beyond!

Here is a list of proposed tracing topics. For Linux Plumbers 2019, new topics for microconferences can be submitted via the Call for Proposals (CfP) interface. Please visit the CfP page for more information.

We hope to see you there!

May 20, 2019 05:37 PM

Pete Zaitcev: Google Fi

Seen an amusing blog post today on the topic of the hideous debacle that is Google Fi (on top of being a virtual network). Here's the best part though:

About a year ago I tried to get my parents to switch from AT&T to Google Fi. I even made a spreadsheet for my dad (who likes those sorts of things) about how much money he could save. He wasn’t interested. His one point was that at anytime he can go in and get help from an AT&T rep. I kept asking “Who cares? Why would you ever need that?”. Now I know. He was paying almost $60 a month premium for the opportunity to able to talk to a real person, face-to-face! I would gladly pay that now.

Respect your elders!

May 20, 2019 03:39 PM

Ted Tso: Switching to Hugo

With the demise of Google+, I’ve decided to try to resurrect my blog. Previously, I was using WordPress, but I’ve decided that it’s just too risky from a security perspective. So I’ve decided to move my blog over to Hugo.

A consequence of this switch is that all of the Wordpress comments have been dropped, at least for now.

May 20, 2019 03:19 AM

May 14, 2019

Dave Airlie (blogspot): Senior Job in Red Hat graphics team

We have a job in our team, it's a pretty senior role, definitely want people with lots of experience. Great place to work, ignore any possible future mergers :-)

https://global-redhat.icims.com/jobs/68911/principal-software-engineer/job?mobile=false&width=1526&height=500&bga=true&needsRedirect=false&jan1offset=600&jun1offset=600

May 14, 2019 09:07 PM

May 10, 2019

Linux Plumbers Conference: RISC-V microconference accepted for the 2019 Linux Plumbers Conference

The open nature of the RISC-V ecosystem has allowed contributions from both academia and industry, leading to an unprecedented number of new hardware design proposals in a very short time span. Linux support is the key to enabling these new hardware options. Since last year’s Plumbers, many kernel features were added to RISC-V. To name a few, we now have out-of-the-box 32-bit and eBPF support, some key issues with the Linux boot process have been addressed, and hypervisor support is on its way.

Last year’s RISC-V microconference was such a success that we would like to repeat that again this year by focusing on finding solutions and discussing ideas that require kernel changes.

Topics for this year’s microconference are expected to cover:

If you’re interested in participating in this microconference, please contact Atish Patra (atish.patra@wdc.com) or Palmer Dabbelt (palmer@dabbelt.com) . For Linux Plumbers 2019, new topics for microconferences can be submitted via the Call for Proposals (CfP) interface. Please visit the CfP page for more information.

LPC will be held in Lisbon, Portugal from Monday, September 9 through Wednesday, September 11.

We hope to see you there!

May 10, 2019 07:55 PM

May 09, 2019

Davidlohr Bueso: Linux v5.1: Performance Goodies

sched/wake_q: reduce atomic operations for special users

Some core users of wake_qs, futex and rwsems, were incurring double task reference counting, a side effect of staying on the safe side. This change brings their performance in line with the rest of the users.
[Commit 07879c6a3740]

irq: Speedup for interrupt statistics in /proc/stat

On large systems with a large amount of interrupts the readout of /proc/stat takes a long time to sum up the interrupt statistics.  The reason for this is that interrupt statistics are accounted per cpu. So the /proc/stat logic has to sum up the interrupt stats for each interrupt. While applications shouldn't really be doing this to a point where it creates bottlenecks, the fix was fairly easy.
[Commit 1136b0728969]

mm/swapoff: replace quadratic complexity with linear

try_to_unuse() is of quadratic complexity, with a lot of wasted effort. It unuses swap entries one by one, potentially iterating over all the page tables for all the processes in the system for each one. With these changes, it now iterates over the system's mms once, unusing all the affected entries as it walks each set of page tables.

Improvements show time reductions for swapoff being called on a swap partition containing about 6G of data, from 8 to 3 minutes.
[Commit c5bf121e4350 b56a2d8af914]

mm: make pinned_vm an atomic counter

This reduces some of the bulky mmap_sem games that are played when dealing with the pinned pages counter (mostly by rdma). It also pivots to not relying on the lock for get_user_pages operations.
[Commit 70f8a3ca68d3 3a2a1e90564e b95df5e3e459]

drivers/async: NUMA aware async_schedule calls

Asynchronous function calls reduce, primarily, kernel boot time by safely doing out-of-order operations, such as device discovery. This series improves NUMA locality by being able to schedule device-specific init work on specific NUMA nodes in order to improve the performance of memory initialization. Significant init-time reductions for persistent memory were seen.
[Commit 3451a495ef24 ed88747c6c4a ef0ff68351be 8204e0c1113d 6be9238e5cb6 c37e20eaf4b2 8b9ec6b73277 af87b9a7863c 57ea974fb871]

lib/iov_iter: optimize page_copy_sane()

This avoids cacheline misses when dereferencing a struct page via compound_head(), when possible. Apparently the overhead was visible on TCP doing recvmsg() calls dealing with GRO packets.
[Commit 6daef95b8c91]

fs/epoll: reduce lock contention in ep_poll_callback()

This patch increases the bandwidth of events which can be delivered from sources to the poller by adding poll items to the ready list in a lockless way, via clever use of xchg() while holding a reader rwlock. This improves scenarios with multiple threads generating IO events which are delivered to a single-threaded epoll_wait()er.
[Commit c141175d011f c3e320b61581 a218cc491420]

fs/nfs: reduce cost of listing huge directories (readdirplus)

When listing very large directories via NFS, clients may take a long time to complete. Most of the cost comes from libc's readdir() reading 32k at a time. To improve performance and reduce the number of rpc calls, the NFS readdirplus rpc now asks for larger chunks of data (more than 32k); the data can fill more than one page, and the cached pages can be used for the next readdir call. Benchmarks show rpc calls decreasing by 85% while listing a directory with 300k files.
[Commit be4c2d4723a4]

fs/pnfs: Avoid read/modify/write when it is not necessary

When testing with fio, throughput of overwrites (both buffered and O_SYNC) is noticeably improved.
[Commit 97ae91bbf3a7 2cde04e90d5b]

May 09, 2019 08:10 PM

Davidlohr Bueso: Linux v5.0: Performance Goodies

mm/page-alloc: reduce zone->lock contention

Contention in the page allocator was seen in a network traffic report, in which order-0 allocations were being freed directly back to the buddy allocator instead of making use of the per-cpu pages in the page_frag_free() call. Aside from eliminating the contention, it was seen to improve some microbenchmarks.
[Commit 65895b67ad27]

mm/mremap: improve scalability on large regions

When THP is disabled, move_page_tables() can bottleneck a large mremap() call, as it will copy each pte at a time. This patch speeds up the performance by copying at the PMD level when possible. Up to 20x speedups were seen when doing a 1Gb remap.
[Commit 2c91bd4a4e2e]

mm: improve anti-fragmentation

Given sufficient time or an adverse workload, memory gets fragmented and the long-term success of high-order allocations degrades. Overall the series reduces external-fragmentation-causing events by over 94% on 1 and 2 socket machines, which in turn impacts high-order allocation success rates over the long term.
[Commit 6bb154504f8b a921444382b4 0a79cdad5eb2 1c30844d2dfe]

mm/hotplug: optimize clear hw_poisoned_pages()

During hotplug remove, the kernel will loop over the respective number of pages looking for poisoned ones. Checking the atomic counter first allows the function to bail out early when there are none.
[Commit 5eb570a8d924]
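
Sketching the shape of the optimization (the names below only loosely follow the kernel's, and the per-page check is a stand-in): consult the global poisoned-page count up front and skip the whole per-page scan when it is zero.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_long num_poisoned_pages;

static bool page_is_poisoned(unsigned long pfn)
{
	(void)pfn;
	return false;	/* stand-in for the real per-page check */
}

static void clear_hwpoisoned_pages(unsigned long start_pfn, unsigned long nr)
{
	/* Fast path: no poisoned pages anywhere, skip the whole loop. */
	if (atomic_load_explicit(&num_poisoned_pages,
				 memory_order_relaxed) == 0)
		return;

	for (unsigned long pfn = start_pfn; pfn < start_pfn + nr; pfn++)
		if (page_is_poisoned(pfn))
			atomic_fetch_sub_explicit(&num_poisoned_pages, 1,
						  memory_order_relaxed);
}

int main(void)
{
	clear_hwpoisoned_pages(0, 1 << 18);	/* returns immediately */
	puts("done");
	return 0;
}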

mm/ksm: Replace jhash2 with xxhash

xxhash is an extremely fast non-cryptographic hash algorithm for checksumming, making it suitable to use in kernel samepage merging. On a custom KSM benchmark, throughput was seen to improve from 1569 to 8770 MB/s.

genirq/affinity: Spread IRQs to all available NUMA nodes

If the number of NUMA nodes exceeds the number of MSI/MSI-X interrupts which are allocated for a device, the interrupt affinity spreading code fails to spread them across all nodes. NUMA nodes above the number of interrupts are all assigned to hardware queue 0 and therefore NUMA node 0, which results in bad performance and has CPU hotplug implications. Fix this by assigning the nodes to interrupts in a round-robin fashion.
[Commit b82592199032]
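
The fix boils down to wrapping around with a modulo rather than clamping every extra node to vector 0; a trivial illustration (the numbers are made up and this is not the genirq code):

#include <stdio.h>

int main(void)
{
	const int nr_nodes = 8, nr_irqs = 3;

	/* Round-robin: nodes beyond nr_irqs wrap instead of piling on 0. */
	for (int node = 0; node < nr_nodes; node++)
		printf("node %d -> irq vector %d\n", node, node % nr_irqs);
	return 0;
}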

fs/epoll: Optimizations for epoll_wait()

Various performance changes oriented towards improving the waiting side, such that contention on the epoll waitqueue spinlock (previously ep->lock) is reduced. This produces pretty good results for various concurrent epoll_wait(2) benchmarks.
[Commit 74bdc129850c 4e0982a00564 76699a67f304 21877e1a5b52 c5a282e9635e abc610e01c66 86c051793b4c]

lib/sbitmap: Various optimizations

Two optimizations were introduced to the sbitmap core, which is used, for example, by the blk-mq tags. The first optimizes wakeup checks and extends the core API, while the second introduces batched clearing of bits, trading 64 atomic bitops for 2 cmpxchg calls.
[Commit 5d2ee7122c73 ea86ea2cdced]
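
Roughly, the batched clearing works like the sketch below (the sbitmap code differs in detail and uses cmpxchg rather than the exchange-and-AND shown here; all names are invented): completions accumulate bits in a deferred mask, and a later flush removes the whole batch from the live word with a single atomic operation instead of one atomic bitop per bit.

#include <stdatomic.h>
#include <stdio.h>

static atomic_ulong word;	/* live allocation bitmap word */
static atomic_ulong cleared;	/* bits waiting to be cleared in a batch */

static void defer_clear(unsigned int bit)
{
	/* Cheap: just remember that this bit should go away. */
	atomic_fetch_or_explicit(&cleared, 1UL << bit, memory_order_relaxed);
}

static void flush_cleared(void)
{
	/* Grab the whole pending batch... */
	unsigned long mask =
		atomic_exchange_explicit(&cleared, 0UL, memory_order_acquire);

	/* ...and knock it out of the live word in one atomic op. */
	atomic_fetch_and_explicit(&word, ~mask, memory_order_release);
}

int main(void)
{
	atomic_store(&word, 0xffUL);	/* bits 0-7 allocated */
	for (unsigned int bit = 0; bit < 6; bit++)
		defer_clear(bit);
	flush_cleared();
	printf("word now: %#lx\n", atomic_load(&word));
	return 0;
}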

fs/locks: Avoid thundering herd wakeups

When one thread releases a lock on a given file, it wakes up all other threads that are waiting (classic thundering herd) - one will get the lock and the others go back to sleep. The overhead starts being noticeable with increasing thread counts. These changes create a tree of pending lock requests in which siblings don't conflict and each lock request does conflict with its parent. When a lock is released, only requests which don't conflict with each other are woken.

Testing shows that lock-acquisitions-per-second is now fairly stable even as the number of contending processes grows to 1000. Without this patch, locks-per-second drops off steeply after a few tens of processes. Micro-benchmarks are available via the lockscale program, which tests fcntl(..., F_OFD_SETLKW, ...) and flock(..., LOCK_EX) calls.

arm64/lib: improve crc32 performance for deep pipelines

This change replaces most branches with a branchless code path that overlaps 16-byte loads to process the first (length % 32) bytes, and processes the remainder using a loop that handles 32 bytes at a time.
[Commit efdb25efc764]

May 09, 2019 08:10 PM

Michael Kerrisk (manpages): man-pages-5.01 is released

I've released man-pages-5.01. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from just over 20 contributors. The release is smaller than typical; it includes just over 70 commits that changed just over 40 pages.

The most notable of the changes in man-pages-5.01 is the following:


May 09, 2019 12:10 PM

May 04, 2019

Pete Zaitcev: YAML

Seen in a blog entry by Martin Tournoij (via):

I’ve been happily programming Python for over a decade, so I’m used to significant whitespace, but sometimes I’m still struggling with YAML. In Python the drawbacks and loss of clarity are contained by not having functions that are several pages long, but data or configuration files have no such natural limits to their length.

[...]

YAML may seem ‘simple’ and ‘obvious’ when glancing at a basic example, but turns out it’s not. The YAML spec is 23,449 words; for comparison, TOML is 3,339 words, JSON is 1,969 words, and XML is 20,603 words.

There's more where the above came from. In particular, the portability issues are rather surprising.

Unfortunately for me, OpenStack TripleO is based on YAML.

May 04, 2019 06:48 PM

Linux Plumbers Conference: BPF microconference accepted into 2019 Linux Plumbers Conference

We are pleased to announce that the BPF microconference has been accepted into the 2019 Linux Plumbers Conference! Last year’s BPF microconference was such a success that it will be held again this year.

BPF along with its just-in-time (JIT) compiler inside the Linux kernel allows for versatile programmability of the kernel and plays a major role in networking (XDP, tc BPF, etc.), tracing (kprobes, uprobes, tracepoints) and security (seccomp, landlock) subsystems.

Since last year’s Plumbers Conference, many of the discussed improvements have been tackled and found their way into the Linux kernel, such as significant steps towards allowing for a compile-once paradigm with the help of BTF and global data support, as well as considerable verifier scalability improvements, to name a few. The topics proposed for this year’s event include:

– libbpf, loader unification
– Standardized BPF ELF format
– Multi-object semantics and linker-style logic for BPF loaders
– Verifier scalability work towards 1 million instructions
– Sleepable BPF programs
– BPF loop support
– Indirect calls in BPF
– Unprivileged BPF
– BPF type format (BTF)
– BPF timers
– bpftool
– LLVM BPF backend, JITs and BPF offloading
– and more

Come join us and participate in the decision making of one of the most cutting edge advancements in the Linux kernel!

See here for a detailed preview of the proposed and accepted topics. For Linux Plumbers 2019, new topics for microconferences can be submitted via the Call for Proposals (CfP) interface. Please visit the CfP page for more information.

We hope to see you there!

May 04, 2019 01:31 AM

May 02, 2019

Pete Zaitcev: Fraud in the material world

Wow, they better not be building Boeings from this crap:

NASA Launch Services Program (LSP) investigators have determined the technical root cause for the Taurus XL launch failures of NASA’s Orbiting Carbon Observatory (OCO) and Glory missions in 2009 and 2011, respectively: faulty materials provided by aluminum manufacturer, Sapa Profiles (SPI). LSP’s technical investigation led to the involvement of NASA’s Office of the Inspector General and the U.S. Department of Justice (DOJ). DOJ’s efforts, recently made public, resulted in the resolution of criminal charges and alleged civil claims against SPI, and its agreement to pay $46 million to the U.S. government and other commercial customers. This relates to a 19-year scheme that included falsifying thousands of certifications for aluminum extrusions to hundreds of customers.

BTW, those costly failures probably hastened the sale of Orbital to ATK in 2015. There were repercussions for the personnel running the Taurus program as well.

May 02, 2019 05:52 PM

Daniel Vetter: Upstream First

lwn.net just featured an article on the sustainability of open source, which seems to have been a bit of a topic in various places for a while. I gave a keynote at the Siemens Linux Community Event 2018 last year which lends itself to a different take on all this:

The slides for those who don’t like videos.

This talk was mostly aimed at managers of engineering teams and projects with fairly little experience in shipping open source, and much less experience in shipping open source through upstream cross-vendor projects like the kernel. It goes through all the usual failings and missteps and explains why an upstream first strategy is the right one, but with a twist: instead of technical reasons, it’s all based on economic considerations of why open source is succeeding. Fundamentally it’s not about the better software, or the cheaper price, or that the software freedoms are a good thing worth supporting.

Instead, open source is eating the world because it enables a much more competitive software market. And all the best practices around open development are just there to enable that highly competitive market. Instead of arguing that open source has open development and strongly favours public discussions because that results in better collaboration and better software, we put on the economic lens, and private discussions become insider trading and collusion. And that’s just not considered cool in a competitive market. Similar arguments can be made for everything else going on in open source projects.

Circling back to the list of articles at the top, I think it’s worth looking at the sustainability of open source as an economic issue of an extremely competitive market, in other words, as a market failure: occasionally the result is that no one gets paid and the customers only receive a sub-par product with all costs externalized - costs like keeping up with security issues. And like with other market failures, a solution needs to be externally imposed through regulations, taxation and transfers to internalize all the costs again into the product’s price. Frankly, I have no idea what that would look like in practice though.

Anyway, just a thought, but good enough a reason to finally publish the recording and slides of my talk, which covers this just in passing in an offhand remark.

Update: Fix slides link.

May 02, 2019 12:00 AM

April 24, 2019

Linux Plumbers Conference: Lots of microconferences proposed for LPC

Microconference proposals have been rolling in for the 2019 Linux Plumbers Conference, but it is not too late to submit more. So far, we have the following microconference proposals:

If you have suggestions for topics to be discussed in those microconferences, please email contact@linuxplumbersconf.org to connect with the microconference runners.

Other microconference topic areas are still welcome, please go to the CFP page to submit yours today!

April 24, 2019 01:36 PM

April 13, 2019

Linux Plumbers Conference: Registration is open for the 2019 Linux Plumbers Conference

Registration is now open for the 2019 edition of the Linux Plumbers Conference (LPC). It will be held September 9-11 in Lisbon, Portugal with dedicated Linux Kernel Summit and Networking tracks, as was done last year, along with the microconferences and refereed presentations that are LPC standards. Go to the registration site to sign up or the attend page for more information on dates and quotas for the various registration types. Early registration will run until June 30 or until the quota is filled.

Note that the CFPs for microconferences, refereed track talks, and BoFs are still open, please see this page for more information.

As always, please contact the organizing committee if you have questions.

April 13, 2019 09:23 AM

April 12, 2019

Linux Plumbers Conference: Linux Plumbers Conference 2019 Call for Bird of a Feather (BoF) Session Proposals

On the heels of the previous announcements, we are also pleased to announce the Bird of a Feather (BoF) Session Proposals for the 2019 edition of the Linux Plumbers Conference, which will be held in Lisbon, Portugal on September 9-11 in conjunction with the Linux Kernel Maintainer Summit.

BoFs are free-form get-togethers for people wishing to discuss a particular topic. As always, you only need to submit proposals for BoFs you want to hold on-site. In contrast, and again as always, informal BoFs may be held at local drinking establishments or in the “hallway track” at your convenience.

For more information on submitting a BoF session proposal, see the following:

https://www.linuxplumbersconf.org/event/4/abstracts

Please note that the submission system is the same as in 2018. If you created a user account last year, you will be able to re-use the same credentials to submit and modify your proposal(s) this year.

The calls for Microconferences and Refereed-Track proposals are also open, and we hope to see you in Lisbon this coming September!

April 12, 2019 08:05 AM

April 10, 2019

Paul E. Mc Kenney: Confessions of a Recovering Proprietary Programmer, Part XVI

I build quite a few Linux kernels, mostly in support of my deep and abiding rcutorture habit. These builds can take some time, even on modern laptops, but they are nevertheless amazingly fast compared to the build times of the much smaller projects I worked on in decades past. Additionally, build times are way down in the noise when I am doing multi-hour rcutorture runs. So much so that I don't bother with cut-down kernel configurations, especially given that cut-down configurations are an excellent way to fail to spot subtle RCU API problems.

Still, faster builds do have their advantages, especially when doing a series of short tests, such as when chasing down that rarest of creatures, an RCU bug that reproduces reliably within a few minutes of boot. Which is exactly what I was doing yesterday. And during that time, a five-minute kernel build time was much more annoying than it normally would be.

But that is why we have ccache, a tool that is considerably more attractive than it was back when my laptop's mass storage weighed in at “only” a few tens of gigabytes. With a bit of help from here, here, and the ccache man page, I got ccache up and running, and somewhat later got it actually making kernel builds go faster. Sometimes considerably more than an order of magnitude faster!

But I do get spoiled really quickly.

You see, the first ccache build goes no faster than a normal build because the cache is initially empty. And yes, a five, six, or even seven-minute build was just fine a couple of days ago: After all, there is always some small task that needs to be done. But having just witnessed builds completing in way less than one minute, even a five-minute wait now seemed horribly slow. And a five-minute build is what I get the first time I run a given rcutorture scenario. Or after I modify an rcutorture scenario. Or if I specify unusual arguments to rcutorture's --kconfig command-line option. Or if I modify a heavily used include file. Or when I configured ccache's cache size too small.

Nevertheless, I most definitely should have installed ccache a very long time ago! :-)

April 10, 2019 09:37 PM

April 08, 2019

Linux Plumbers Conference: Results from the 2018 LPC survey

Thank you to everyone who participated in the survey after Linux Plumbers in 2018. We had 134 responses, which, given the total number of conference participants of around 492, has provided confidence in the feedback.

Overall: 85% of respondents were positive about the event, with only 2% actually saying they were dissatisfied. Co-locating with Kernel Summit proved popular, so we will be co-locating with Kernel Summit in 2019. Co-locating with Networking Summit was also well received, so we will be doing that again this year, too. Conference participation was up from 2017 and we sold out again this year. 98% of those that registered were able to attend.

Based on feedback from last year’s survey, we videotaped all of the sessions, and the videos are now posted. There are over 100 hours of video in our YouTube channel or you can access them by visiting the detailed schedule and clicking on the video link in the presentation materials section of any given talk or discussion. The Microconferences are recorded as one long video block, but clicking on the video link of a particular discussion topic will take you to the time index in that file where the chosen discussion begins.

Venue: 67% of survey respondents considered the number of attendees to be just right, however 25% would have liked to see more people able to attend. In general, 43% of respondents considered the venue size to be a good match, but a significant portion (45%) would have preferred it to be bigger. The room sizes were considered effective for participation by 95% of the respondents, however there was a clear indication in the comments that we need to figure out a better way to allocate rooms based on expected participants, as some ended up overflowing. There is some desire for additional electrical outlets to be made available, which will be looked into for the 2019 event.

Content: In terms of track feedback, Linux Plumbers Refereed track and Kernel Summit track were indicated as very relevant by almost all respondents who attended. The Networking track had fewer participants responding on the survey, but was positively reviewed as well. Hallway track continues to be regarded as very relevant, and appreciated.

Communication: This year we had a new website, and participants were able to navigate through it and find the sessions they needed. In the feedback, there were some requests to integrate scheduling-app capabilities (and attendee room sizing); the committee will look into options for that.

Events: Craft Beer was the most popular event and received favorable feedback from respondents. There were some concerns expressed in the written feedback that we didn’t clarify that non-alcoholic options were available there, and we’ll take note to communicate this better in future. The final closing event venue was originally planned for conference attendance similar to the prior year; the 20% increase to 492 attendees impacted this event, and the comments suggest the perception that it was too crowded and had insufficient food.

There were lots of great suggestions to the “what one thing would you like to see changed” question, and the program committee has been studying them to see what is possible to implement this year. Thank you again to the participants for their input and help on making the Linux Plumbers Conference better in 2019 and the future.

April 08, 2019 03:30 PM

March 30, 2019

James Bottomley: A Roadmap for Eliminating Patents in Open Source

The realm of Software Patents is often considered to be a fairly new field which isn’t really influenced by anything else that goes on in the legal landscape. In particular, there’s a very old field of patent law called exhaustion which had, up until a few years ago, never been applied to software patents. This lack of application means that exhaustion is rarely raised as a defence against infringement and thus it is regarded as an untested strategy. Van Lindberg recently did a FOSDEM presentation containing interesting ideas about how exhaustion might apply to software patents in the light of recent court decisions. The intriguing possibility this offers us is that we may be close to an enforceable court decision (at least in the US) that would render all patents in open source owned by community members exhausted and thus unenforceable. The purpose of this blog post is to explain the current landscape and how we might be able to get the necessary missing court decisions to make this hope a reality.

What is Patent Exhaustion?

Patent law is ancient, going back to Greece in around 500BC. However, every legal system has been concerned to ensure that patent holders, being an effective monopoly with the legal right to exclude others, did not abuse that monopoly position. This led to the concept that if you used your monopoly power to profit, you should only be able to do it once for the same item so that absolute property rights couldn’t be clouded by patents. This leads to something called the exhaustion doctrine: if Alice holds a patent on some item which she sells to Bob and Bob later sells the same item to Charlie, Alice can’t force Bob or Charlie to give her a part of their sale proceeds in exchange for her allowing Charlie to practise the patent on the item. The patent rights are said to be exhausted with the sale from Alice to Bob, so there are no patent rights left to enforce on Charlie. The exhaustion doctrine has since been expanded to any authorized transfer, even if no money changes hands (so if Alice simply gave Bob the item instead of selling it, the patent still exhausts at that transaction and Bob is still free to give or sell the item to Charlie without interference from Alice).

Of course, modern US patent rights have been around now for two centuries and in that time manufacturers have tried many ingenious schemes to get around the exhaustion doctrine profitably, all of which have so far failed in the courts, leading to quite a wealth of case law on the subject. The most interesting recent example (Lexmark v Impression) was over whether a patent holder could use their patent power to enforce any onward conditions at all, to which the US Supreme Court came to the conclusive finding that they can’t, going on to say that all patent rights in the item terminate in the first authorized transfer. That doesn’t mean no post-sale conditions can be imposed; they can, by contract or licence or other means, it just means post-sale conditions can’t be enforced by patent actions. This is the bind for Lexmark: their sales contracts did specify that empty cartridges couldn’t be resold, so their customers violated that contract by selling the cartridges to Impression to refill and resell. However, that contract was between Lexmark and the customer, not Lexmark and Impression, so absent patent remedies Lexmark has no contractual case against Impression, only against its own customers.

Can Exhaustion apply if Software isn’t actually sold?

The exhaustion doctrine actually has an almost identical equivalent for copyright called the First Sale doctrine. Back when software was being commercialized, no software distributor liked the idea that copyright in software exhausts after it is sold, so the idea of licensing instead of selling software was born, which is why you always get that end user licence agreement for software you think you bought. However, this makes all software (including open source) very tricky for patent exhaustion because there’s no first sale to exhaust the rights.

The idea that Exhaustion didn’t have to involve an exchange of something (so became authorized transfer instead of first sale) in US law is comparatively recent, dating to a 2013 decision LifeScan v Shasta where one point won on appeal was that giving away devices did exhaust the patent. The idea that authorized transfer could extend to software downloads really dates to Cascades v Samsung in 2014.

The bottom line is that exhaustion does apply to software and downloading is an authorized transfer within the meaning of the Exhaustion Doctrine.

The Implications of Lexmark v Impression for Open Source

The precedent for Open Source is quite clear: Patents cannot be used to impose onward conditions that the copyright licence doesn’t. For instance, the Open Air Interface 5G alliance public licence attempts just such a restriction in clause 3 “Grant of Patent License”, where it tries to restrict the grant to being only if you use the source for “study and research”; otherwise you need an additional patent licence from OAI. Lexmark v Impression makes that clause invalid in the licence: once you obtain open source under the OAI licence, the OAI patents exhaust at that point and there are no onward patent rights left to enforce. This means that source distributed under OAI can be reused under the terms of the copyright licence (which is permissive) without any fear of patent restrictions. Now OAI can still amend its copyright licence to impose the field of use restrictions and enforce them via copyright means, it just can’t use patents to do so.

FRAND and Open Source

There have recently been several attempts to claim that FRAND patent enforcement and Open Source licensing can be compatible, or more specifically a FRAND patent pool holder like a Standards Development Organization can both produce an Open Source reference implementation and still collect patent Royalties. This looks to be wrong, however; the Supreme Court decision is clear: once a FRAND Patent pool holder distributes any code, that distribution is an authorized transfer within the meaning of the first sale doctrine and all FRAND pool patents exhaust at that point. The only way to enforce the FRAND royalty payments after this would be in the copyright licence of the code and obviously such a copyright licence, while legal, would not be remotely an Open Source licence.

Exhausting Patents By Distribution

The next question to address is could patents become exhausted simply because the holder distributed Open Source code in any form? As I said before, there is actually a case on point for this as well: Cascades v Samsung. In this case, Cascades tried to sue Samsung for violating a patent on the Dalvik JIT engine in AOSP. Cascades claimed they had licensed the patent to Google for a payment only for use in Google products. Samsung claimed exhaustion because Cascades had licensed the patent to Google and Samsung downloaded AOSP from Google. The court agreed with this and dismissed the infringement action. Case closed, right? Not so fast: it turns out Cascades raised a rather silly defence to Samsung’s claim of exhaustion, namely that the authorized transfer under the exhaustion doctrine didn’t happen until Samsung did the download from Google, so they were still entitled to enforce the Google products only restriction. As I said in the beginning courts have centuries of history with manufacturers trying to get around the exhaustion doctrine and this one crashed and burned just like all the others. However, the question remains: if Cascades had raised a better defence to the exhaustion claim, would they have prevailed?

The defence Cascades could have raised is that Samsung didn’t just download code from Google, they also copied the code they downloaded and those copies should be covered under the patent right to exclude manufacture, which didn’t exhaust with the download. To illustrate this in the Alice, Bob, Charlie chain: Alice sells an item to Bob and thus exhausts the patent so Bob can sell it on to Charlie unencumbered. However that exhaustion does not give either Bob or Charlie the right to manufacture a new copy of the item and sell it to Denise because exhaustion only applies to the same item Alice sold, not to a newly manufactured copy of that item.

The copy as new manufacture defence still seems rather vulnerable on two grounds: first because Samsung could download any number of exhausted copies from Google, so what’s the difference between them downloading ten copies and them downloading one copy and then copying it themselves nine times. Secondly, and more importantly, Cascades already had a remedy in copyright law: their patent licence to Google could have required that the AOSP copyright licence be amended not to allow copying of the source code by non-Google entities except on payment of royalties to Cascades. The fact that Cascades did not avail themselves of this remedy at the time means they’re barred from reclaiming it now via patent action.

The bottom line is that “distribution exhausts all patents reading on the code you distribute” is a very reasonable defence to maintain in a patent infringement lawsuit, and it’s one we should be using much more often.

Exhaustion by Contribution

This is much more controversial and currently has no supporting case law. The idea is that Distribution can occur even with only incremental updates on the existing base (git pull to update code, say), so if delta updates constitute an authorized transfer under the exhaustion doctrine, then so must a patch based contribution, being a delta update from a contributor to the project, be an authorized transfer. In which case all patents which read on the project at the time of contribution must also exhaust when the contribution is made.

Even if the above doesn’t fly, it’s undeniable most contributions today are made by cloning a git tree and republishing it plus your own updates (essentially a github fork) which makes you a bona fide distributor of the whole project because it can all be downloaded from your cloned tree. Thus I think it’s reasonable to hold that all patents owned by distributors and contributors in an open source project have exhausted in that project. In other words all the arguments about the scope and extent of patent grants and patent capture in open source licences is entirely unnecessary.

Therefore, all active participants in an Open Source community ipso facto exhaust any patents on the community code as that code is redistributed.

Implications for Proprietary Software

Firstly, it’s important to note that the exhaustion arguments above have no impact on the patentability of software or the validity of software patents in general, just on their enforcement. Secondly, exhaustion is triggered by the unencumbered right to redistribute which is present in all Open Source licences. However, proprietary software doesn’t come with a right to redistribute in the copyright licence, meaning exhaustion likely doesn’t trigger for them. Thus the exhaustion arguments above have no real impact on the ability to enforce software patents in proprietary code except that one possible defence that could be raised is that the code practising the patent in the proprietary software was, in fact, legitimately obtained from an open source project under a permissive licence and thus the patent has exhausted. The solution, obviously, is that if you worry about enforceability of patents in proprietary software, always use a copyleft licence for your open source.

What about the Patent Troll Problem?

Trolls, by their nature, are not IP producing entities, thus they are not ecosystem participants. Therefore trolls, being outside the community, can pursue infringement cases unburdened by exhaustion problems. In theory this is true, but only partially: Trolls don’t produce anything, therefore they have to acquire their patents from someone who does. That means that if the producer from whom the troll acquired the patent was active in the community, the patent has still likely exhausted. Since the life of a patent is roughly 20 years and mass adoption of open source throughout the software industry is only really 10 years old, there may still exist patents owned by Trolls that came from corporations before they began to be Open Source players and thus might not be exhausted.

The hope this offers for the Troll problem is that in 10 years time, all these unexhausted patents will have expired and thanks to the onward and upward adoption of open source there really will be no place for Trolls to acquire unexhausted patents to use against the software industry, so the Troll threat is time limited.

A Call to Arms: Realising the Elimination of Patents in Open Source

Your mission, should you choose to be part of this project, is to help advance the legal doctrines on patent exhaustion. In particular, if the company you work for is sued for patent infringement in any Open Source project, even by a troll, suggest they look into asserting an exhaustion based defence. Even if your company isn’t currently under threat of litigation, simply raising awareness of the option of exhaustion can help enormously.

The first case in which an exhaustion defence could potentially be tried is this one: Sequoia Technology is asserting a patent against LVM in the Linux kernel. However, it turns out that patent 6,718,436 is actually assigned to ETRI, who merely licensed it to Sequoia for the purposes of litigation. ETRI, by the way, is a Linux Foundation member but, more importantly, in 2007 ETRI launched their own distribution of Linux called Booyo, which would appear to be evidence that their own actions as a distributor of the Linux Kernel have exhausted patent 6,718,436 in Linux long before they ever licensed it to Sequoia.

If we get this right, in 10 years the Patent threat in Open Source could be history, which would be a nice little legacy to leave our children.

March 30, 2019 09:32 PM

March 29, 2019

Pete Zaitcev: Swift(stack) bragging today

On their official blog:

Over the last several months, SwiftStack has been busy helping two large autonomous vehicle customers. These data pipelines are distributed across edge (vehicle sensors) to core (data center) to cloud/multi-cloud locations, and are challenged with ingest, labeling, training, inferencing, and retaining data at scale. [...] one deployment is handling more than a petabyte of data per week, with four thousand GPU cores from NVIDIA DGX-1 servers fed with 100 GB/s of throughput from SwiftStack cluster.

I suspect the task-queue expirer could be helpful with this. Although, if you're uploading 1 PB per week, it takes about a year to fill out a cluster as big as Turkcell's.

Apparently the actual storage is provided by Cisco UCS S3260. Some of our customers use Cisco UCS to run Swift too. I always thought about Cisco as a networking company, but it's different nowadays.

March 29, 2019 03:54 AM

Daniel Vetter: X.org Elections: freedesktop.org Merger - Vote Now!

Aside from the regular board elections we also have some bylaw changes to vote on. As usual with bylaw changes, we need a supermajority of all members to agree - if you don’t vote you essentially reject it, but the board has no way of knowing.

Please see the detailed changes of the bylaws, make up your mind, and go voting on the shiny new members page.

March 29, 2019 12:00 AM