September 2014 – Code happens

So Monty and Sean have recently blogged about about the structures (1, 2) they think may work better for OpenStack. I like the thrust of their thinking but had some mumblings of my own to add.

Firstly, I very much like the focus on social structure and needs – what our users and deployers need from us. That seems entirely right.

And I very much like the getting away from TC picking winners and losers. That was never an enjoyable thing when I was on the TC, and I don’t think it has made OpenStack better.

However, the thing that picking winners and losers did was that it allowed users to pick an API and depend on it. Because it was the ‘X API for OpenStack’. If we don’t pick winners, then there is no way to say that something is the ‘X API for OpenStack’, and that means that there is no forcing function for consistency between different deployer clouds. And so this appears to be why Ring 0 is needed: we think our users want consistency in being able to deploy their application to Rackspace or HP Helion. They want vendor neutrality, and by giving up winners-and-losers we give up vendor neutrality for our users.

Thats the only explanation I can come up with for needing a Ring 0 – because its still winners and losers (e.g. picking an arbitrary project) keystone, grandfathering it in, if you will. If we really want to get out of the role of selecting projects, I think we need to avoid this. And we need to avoid it without losing vendor neutrality (or we need to give up the idea of vendor neutrality).

One might say that we must pick winners for the very core just by its, but I don’t think thats true. If the core is small, many people will still want vendor neutrality higher up the stack. If the core is large, then we’ll have a larger % of APIs covered and stable granting vendor neutrality. So a core with fixed APIs will be under constant pressure to expand: not just from developers of projects, but from users that want API X to be fixed and guaranteed available and working a particular way at [most] OpenStack clouds.

Ring 0 also fulfils a quality aspect – we can check that it all works together well in a realistic timeframe with our existing tooling. We are essentially proposing to pick functionality that we guarantee to users; and an API for that which they have everywhere, and the matching implementation we’ve tested.

To pull from Monty’s post:

“What does a basic end user need to get a compute resource that works and seems like a computer? (end user facet)

What does Nova need to count on existing so that it can provide that. ”

He then goes on to list a bunch of things, but most of them are not needed for that:

We need Nova (its the only compute API in the project today). We don’t need keystone (Nova can run in noauth mode and deployers could just have e.g. Apache auth on top). We don’t need Neutron (Nova can do that itself). We don’t need cinder (use local volumes). We need Glance. We don’t need Designate. We don’t need a tonne of stuff that Nova has in it (e.g. quotas) – end users kicking off a simple machine have -very- basic needs.

Consider the things that used to be in Nova: Deploying containers. Neutron. Cinder. Glance. Ironic. We’ve been slowly decomposing Nova (yay!!!) and if we keep doing so we can imagine getting to a point where there truly is a tightly focused code base that just does one thing well. I worry that we won’t get there unless we can ensure there is no pressure to be inside Nova to ‘win’.

So there’s a choice between a relatively large set of APIs that make the guaranteed available APIs be comprehensive, or a small set that that will give users what they need just at the beginning but might not be broadly available and we’ll be depending on some unspecified process for the deployers to agree and consolidate around what ones they make available consistently.

In sort one of the big reasons we were picking winners and losers in the TC was to consolidate effort around a single API – not implementation (keystone is already on its second implementation). All the angst about defcore and compatibility testing is going to be multiplied when there is lots of ecosystem choice around APIs above Ring 0, and the only reason that won’t be a problem for Ring 0 is that we’ll still be picking winners.

How might we do this?

One way would be to keep picking winners at the API definition level but not the implementation level, and make the competition be able to replace something entirely if they implement the existing API [and win hearts and minds of deployers]. That would open the door to everything being flexible – and its happened before with Keystone.

Another way would be to not even have a Ring 0. Instead have a project/program that is aimed at delivering the reference API feature-set built out of a single, flat Big Tent – and allow that project/program to make localised decisions about what components to use (or not). Testing that all those things work together is not much different than the current approach, but we’d have separated out as a single cohesive entity the building of a product (Ring 0 is clearly a product) from the projects that might go into it. Projects that have unstable APIs would clearly be rejected by this team; projects with stable APIs would be considered etc. This team wouldn’t be the TC : they too would be subject to the TC’s rulings.

We could even run multiple such teams – as hinted at by Dean Troyer one of the email thread posts. Running with that I’d then be suggesting

IaaS product: selects components from the tent to make OpenStack/IaaS
PaaS product: selects components from the tent to make OpenStack/PaaS
CaaS product (containers)
SaaS product (storage)
NaaS product (networking – but things like NFV, not the basic Neutron we love today). Things where the thing you get is useful in its own right, not just as plumbing for a VM.

So OpenStack/NaaS would have an API or set of APIs, and they’d be responsible for considering maturity, feature set, and so on, but wouldn’t ‘own’ Neutron, or ‘Neutron incubator’ or any other component – they would be a *cross project* team, focused at the product layer, rather than the component layer, which nearly all of our folk end up locked into today.

Lastly Sean has also pointed out that we have large N N^2 communication issues – I think I’m proposing to drive the scope of any one project down to a minimum, which gives us more N, but shrinks the size within any project, so folk don’t burn out as easily, *and* so that it is easier to predict the impact of changes – clear contracts and APIs help a huge amount there.

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Month: September 2014

what-poles-for-the-tent