Introduction to OpenStack
What is OpenStack?
You've probably heard of OpenStack. It's that cloud software that's getting a lot of attention from big names in the IT industry and major users like CERN, Comcast and PayPal. However, did you know that it's more than that? It's also the fastest growing open source community in the world, and a very interesting collaboration among technology vendors and users.
OpenStack is truly open. If you want to test it out, you can stop reading this article (come back after - we'll still be here!), visit http://status.openstack.org/release and get a real-time list of features that are under development. Select one, follow it to the code review system, and give your comments! This is an example of Open Development, just one of the "Four Opens" on which OpenStack was founded (https://wiki.openstack.org/wiki/Open). Of course, you likely already know that OpenStack is fully released under the Open Source Apache 2 license - no bits reserved - but did you also know about the principle of "Open Design"?
The software is released on a six-month cycle, and at the start of each cycle we host a Design Summit for the contributors and users to gather and plan out the roadmap for the next release. The Design Summit has become part of an increasingly large conference that also hosts workshops for newcomers, inspiring keynotes and some fairly amazing user stories. However, somewhere tucked away are rooms with chairs arranged in semicircular layouts ensconcing dozens of developers engaged in robust discussion, taking notes on a collaborative document displayed on a projector. This is where the roadmap of OpenStack is determined, pathways for implementations of features are aligned and people volunteer to make it reality. This is the Open Design process, and we welcome your participation!
OpenStack may have started with just two organizations, NASA and Rackspace, but now there are hundreds and with each additional member the community grows stronger and more cohesive. Every morning the OpenStack developer awakes to a torrent of email from the discussions in other countries, and the momentum is best described as 'intense'.
Overall, the standout feature of OpenStack is this strong community. It's extremely diverse, made up of people with very different technical backgrounds (Python developers to packagers to translators) and different philosophical backgrounds (free software evangelists to hardcore capitalists). It's also very widespread, with the OpenStack Foundation claiming membership from 130 countries as of October 2013. So, dear reader, chances are there is a place for you.
Becoming a Contributor
Over a twelve-month period, OpenStack typically has more than a thousand software developers contributing patches. Despite this, we always need more!
You can find detailed instructions on how to get started on the wiki (http://wiki.openstack.org/HowToContribute), but just to run through a couple of key aspects...
OpenStack uses Launchpad for bugs and GitHub for code hosting, but neither of these is the place for contributing code. That's right: no GitHub pull requests are accepted. Don't panic yet - this is for good reason: all patches to OpenStack go through an extensive code review and testing process (https://wiki.openstack.org/wiki/Gerrit_Workflow).
Every code change in OpenStack is seen by at least 3 people (the owner, and two of the core reviewers for the project), but often many more, and test cases and PEP8 compliance are required. The patches are also run through the continuous integration system, which effectively builds a new cloud for every code change submitted - ensuring that the piece of software interacts as expected with all other parts of OpenStack. As a result of this quest for quality, many have found that contributing has improved their Python coding skills.
Of course, not everyone is a Python developer. However, not to worry - there's still a place for you to work with us to change the face of cloud computing.
In many cases, the easiest way to become a contributor to OpenStack is to participate in the documentation efforts. It requires no coding, just a willingness to read and understand the systems that you’re writing about. Because the documentation is treated like code, you will be learning the mechanics necessary to make contributions to OpenStack itself by helping with documentation. Visit https://wiki.openstack.org/wiki/Documentation/HowTo to find out more.
If you know how to write in another language, translations of OpenStack happen through an easy-to-use web interface (https://www.transifex.com/projects/p/openstack/). For help, contact the Internationalization team (https://wiki.openstack.org/wiki/I18nTeam).
OpenStack is one of the few projects you will work on that loves to hear your angry rants when something does not work. If you notice something off, consider filing a bug report with details of your environment and what you experienced: http://docs.openstack.org/trunk/openstack-ops/content/upstream_openstack.html - report bugs
http://ask.openstack.org is a StackOverflow-style board for questions about OpenStack. Feel free to use it to ask yours and, if you are able, stick around and try to answer someone else’s. Or at least vote on the ones that look good.
Think OpenStack is pretty cool? Help us out by telling your friends; we'd really appreciate it. You can find some materials to help at http://openstack.org/marketing or join the marketing mailing list to find out some cool events to attend.
Join your local user group (https://wiki.openstack.org/wiki/OpenStackUserGroups) - they're in about 50 countries so far. Attend to learn, or volunteer to speak - we'd love to have you.
Navigating the Ecosystem: Where does your OpenStack Journey begin?
One of the unique aspects of OpenStack as an Open Source project is that there are many different levels at which you can begin to engage with it - you don't have to do everything yourself.
Starting with Public Clouds - you don't even need to have an OpenStack installation to start using it. Today you can swipe your credit card at eNovance, HP, Rackspace and others and just start migrating your applications.
Though, of course, for many the enticing part of OpenStack is to build their own private cloud, and there are several ways to do that. Perhaps the simplest of all is an appliance-style solution. You purchase a thing, unbox it, plug in the power and the network and it just is an OpenStack Cloud.
However, hardware choice is important for many applications, so if that applies to you - consider that there are several software distributions available. You can of course get enterprise-supported OpenStack from Canonical, Red Hat and SUSE, but take a look also at some of the specialized distributions, such as those from Rackspace, Piston, SwiftStack or Cloudscaling.
If you want someone to help guide you through the decisions from the hardware up to your applications, perhaps adding in a few features or integrating components along the way, consider contacting one of the system integrators with OpenStack experience like Mirantis or Metacloud.
Alternatively, if your preference is to build your own OpenStack expertise internally, a good way to kick-start that might be to attend or arrange a training session. The OpenStack Foundation recently launched a Training Marketplace (http://www.openstack.org/marketplace/training), where you can look for events nearby. There's also a community training effort (https://wiki.openstack.org/wiki/Training-manuals) underway to produce open source content for training.
To derive the most from the flexibility of the OpenStack framework, you may elect to perform a 'DIY' solution. In which case, we strongly recommend getting a copy of the OpenStack Operations Guide (http://docs.openstack.org/ops), which discusses many of the decisions you will face along the way. There's also a new OpenStack Security guide (http://docs.openstack.org/sec/) that is an invaluable reference for hardening your installation.
DIYing your OpenStack Cloud
If, after careful analysis, you've decided to construct OpenStack yourself from the ground up, there are a number of areas to consider.
One of the most fundamental underpinnings of a cloud platform is the storage on which it runs.
In general, when you select storage back-ends, ask the following questions:
- Do my users need block storage?
- Do my users need object storage?
- Do I need to support live migration?
- Should my persistent storage drives be contained in my compute nodes, or should I use external storage?
- What is the platter count I can achieve? Do more spindles result in better I/O despite network access?
- Which one results in the best cost-performance scenario I'm aiming for?
- How do I manage the storage operationally?
- How redundant and distributed is the storage? What happens if a storage node fails? To what extent can it mitigate my data-loss disaster scenarios?
- Which plugin do I use for block storage?
For many new clouds, the object storage and persistent/block storage are great features that users want. However, with OpenStack you're not forced to use either if you want a simpler deployment.
Many parts of OpenStack are pluggable, and one of the best examples of this is Block Storage - which you are able to configure to use storage from a long list of vendors (Coraid, EMC, GlusterFS, Hitachi, HP, IBM, LVM, NetApp, Nexenta, NFS, RBD, Scality, SolidFire, Windows Server, Zadara).
If this is the first time you are deploying a cloud infrastructure in your organization, after reading this section, your first conversations should be with your networking team. Network usage in a running cloud is vastly different from traditional network deployments, and has the potential to be disruptive at both a connectivity and a policy level.
For example, you must plan the number of IP addresses that you need for both your guest instances as well as management infrastructure. Additionally, you must research and discuss cloud network connectivity through proxy servers and firewalls.
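As a rough illustration of that address planning, the sketch below uses Python's standard ipaddress module to count how many addresses a candidate range leaves for guests or management hosts. The subnets, the reserved counts and the function name are hypothetical, chosen only for the example, and the arithmetic assumes ordinary IPv4 subnets of /30 or larger.

```python
import ipaddress


def usable_addresses(cidr: str, reserved: int = 0) -> int:
    """Return how many host addresses a subnet offers after reservations.

    `reserved` is a hypothetical count of addresses set aside for
    gateways, DHCP servers and similar infrastructure.
    """
    network = ipaddress.ip_network(cidr)
    # Exclude the network and broadcast addresses of an IPv4 subnet.
    hosts = network.num_addresses - 2
    return max(hosts - reserved, 0)


# Hypothetical plan: one range for guest instances, one for management.
guest_capacity = usable_addresses("10.0.0.0/22", reserved=10)
mgmt_capacity = usable_addresses("192.168.100.0/24", reserved=5)

print(guest_capacity)  # 1012
print(mgmt_capacity)   # 249
```

Comparing these numbers against the instance counts you expect is a quick way to spot a range that is too small before the conversation with your networking team.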
One of the first choices you need to make is between the "legacy" nova-network and OpenStack Networking (aka "neutron"). Nova-network is a much simpler way to deploy networking, but does not have the full software defined networking features of Neutron, and will be deprecated after 12-18 months.
Object Storage's network patterns might seem unfamiliar at first. Consider these main traffic flows:
- Among object, container, and account servers
- Between those servers and the proxies
- Between the proxies and your users
Object Storage is very 'chatty' among servers hosting data - even a small cluster does megabytes/second of traffic, which is predominantly "Do you have the object?"/"Yes I have the object". Of course, if the answer to that question is negative or the request times out, replication of the object begins.
Consider the scenario where an entire server fails, and 24 TB of data needs to be transferred "immediately" to remain at three copies - this can put significant load on the network.
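To get a feel for what that means, here is a back-of-the-envelope sketch. It assumes decimal terabytes, a link running at full line rate and no protocol overhead, none of which holds in practice, so treat the results as lower bounds. The function name and the link speeds are ours, chosen for illustration.

```python
def transfer_hours(data_terabytes: float, link_gbps: float) -> float:
    """Hours needed to move the data over one link at full line rate."""
    bits = data_terabytes * 1e12 * 8       # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9)     # gigabits per second to bits per second
    return seconds / 3600


for gbps in (1, 10):
    print(f"{gbps} Gbps: {transfer_hours(24, gbps):.1f} hours")
# 1 Gbps: 53.3 hours
# 10 Gbps: 5.3 hours
```

In a real cluster the copies are pulled from and written to many nodes at once, so the load is spread out, but the total volume of data on the wire stays the same.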
Another oft-forgotten fact is that when a new file is being uploaded, the proxy server must write out as many streams as there are replicas - multiplying the network traffic. For a 3-replica cluster, 10 Gbps in means 30 Gbps out. Combined with the high bandwidth demands of replication described above, this is why we recommend that your private network have significantly higher bandwidth than your public network needs. Oh, and OpenStack Object Storage communicates internally with unencrypted, unauthenticated rsync for performance - you do want the private network to be private.
The remaining point on bandwidth is the public facing portion. Swift-proxy is stateless, which means that you can easily add more and use HTTP load-balancing methods to share bandwidth and availability between them.
More proxies means more bandwidth, if your storage can keep up.
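The bandwidth reasoning above can be written down as two small formulas. This is only a sketch: the function names and the example figures (four proxies, a 25 Gbps storage ceiling) are hypothetical, and the model ignores everything except replica fan-out and a single storage-side limit.

```python
def replica_write_gbps(ingress_gbps: float, replicas: int = 3) -> float:
    """Traffic a proxy sends to storage nodes for a given upload rate."""
    return ingress_gbps * replicas


def public_capacity_gbps(proxies: int, per_proxy_gbps: float,
                         storage_limit_gbps: float) -> float:
    """Aggregate public bandwidth, capped by what the storage can absorb."""
    return min(proxies * per_proxy_gbps, storage_limit_gbps)


print(replica_write_gbps(10))            # 30
print(public_capacity_gbps(4, 10, 25))   # 25
```

The second result shows the caveat in action: four 10 Gbps proxies could front 40 Gbps, but the assumed storage limit holds the cluster to 25.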
To achieve maximum scalability via a shared-nothing/distributed-everything architecture, OpenStack does not have the concept of a "cloud controller". Indeed, one of the biggest decisions deployers face is exactly how to segregate out all of the "central" services - such as the API endpoints, schedulers, database servers and the message queue.
For best results, acquiring some metrics on how the cloud will be used is necessary - though of course, with a proper automated configuration management system it will be possible to scale as operational experience is gained. Key answers to look for may include:
- How many instances will run at once?
- How many compute nodes will run at once?
- How many users will access the API?
- How many users will access the dashboard?
- How many nova-API services do you run at once for your cloud?
- How long does a single instance run?
- Does your authentication system also verify externally?
While choosing the size of compute node hardware is mainly dependent on the types of virtual machines running, sizing "central service" machines can be more difficult. Contrast two clouds running 1,000 virtual machines. One is mainly used for long-running websites, and in the other the average lifetime is more akin to an hour. With so much churn in the latter, it will certainly need a more heavyset API/database/message queue.
Given "scalability" is a key word in OpenStack's mission, it's no surprise that there are several methods dedicated to assisting the expansion of your cloud by segregating it - in addition to the natural horizontal scaling of all components.
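The difference in churn between those two clouds can be estimated with a simple steady-state relation: if a fixed number of instances is running and each lives for an average lifetime, then launches per hour is roughly the instance count divided by that lifetime. The sketch below applies it; the 30-day lifetime for the website cloud is our assumption, as the text only says "long-running".

```python
def launches_per_hour(concurrent_instances: int,
                      avg_lifetime_hours: float) -> float:
    """Steady-state instance launches per hour (terminations match)."""
    return concurrent_instances / avg_lifetime_hours


# Hypothetical lifetimes: 30 days for websites, 1 hour for short jobs.
websites = launches_per_hour(1000, 30 * 24)
short_jobs = launches_per_hour(1000, 1)

print(f"{websites:.2f} launches/hour")    # 1.39 launches/hour
print(f"{short_jobs:.0f} launches/hour")  # 1000 launches/hour
```

Each launch and each termination touches the API, the database and the message queue, so under these assumptions the second cloud drives several hundred times more load onto the central services for the same number of running instances.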
The first two are aimed at very large - multi-site - deployments. Compute cells are designed to allow running the cloud in a distributed fashion without having to use more complicated technologies, or being invasive to existing nova installations. Hosts in a cloud are partitioned into groups called cells. Cells are configured in a tree. The top-level cell ("API cell") has a host that runs the API service, but no hypervisors. Each child cell runs all of the other typical services found in a regular installation, except for the API service. Each cell has its own message queue and database service, and also runs the cell service, which manages the communication between the API cell and child cells.
This allows for a single API server being used to control access to multiple cloud installations. Introducing a second level of scheduling (the cell selection), in addition to the regular nova-scheduler selection of hosts, provides greater flexibility to control where virtual machines are run.
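To make the idea of two scheduling levels concrete, here is a toy model. It is not how nova's cell or host schedulers are implemented: the inventory layout, the "most free RAM" policy and every name in it are hypothetical, chosen only to show a cell being selected first and a host second.

```python
from typing import Dict, Tuple

# Hypothetical inventory: cell name -> {host name: free RAM in MB}.
Cells = Dict[str, Dict[str, int]]


def pick_host(cells: Cells, ram_mb: int) -> Tuple[str, str]:
    """Choose a cell first, then a host inside that cell."""
    # Level 1: keep only cells with at least one host that fits, then
    # prefer the cell with the most free RAM overall.
    candidates = {
        name: hosts for name, hosts in cells.items()
        if any(free >= ram_mb for free in hosts.values())
    }
    if not candidates:
        raise ValueError("no cell can fit the requested instance")
    cell = max(candidates, key=lambda name: sum(candidates[name].values()))

    # Level 2: inside the chosen cell, pick the fitting host with the
    # most free RAM (the first such host wins a tie).
    fitting = {h: free for h, free in candidates[cell].items()
               if free >= ram_mb}
    host = max(fitting, key=fitting.get)
    return cell, host


cells = {
    "cell-east": {"east-1": 2048, "east-2": 8192},
    "cell-west": {"west-1": 4096, "west-2": 4096, "west-3": 4096},
}
print(pick_host(cells, 4096))  # ('cell-west', 'west-1')
print(pick_host(cells, 8192))  # ('cell-east', 'east-2')
```

Because the cell-level policy is separate from the host-level one, an operator could swap in a different rule at the top (for example, preferring a particular site) without touching how hosts are chosen within a cell.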
Contrast this with regions. Regions have a separate API endpoint per installation, allowing for a more discrete separation. Users wishing to run instances across sites have to explicitly select a region. However, the additional complexity of running a new service is not required.
Alternately, you can use availability zones, host aggregates, or both to partition a compute deployment.
Availability zones enable you to arrange OpenStack Compute hosts into logical groups, and provide a form of physical isolation and redundancy from other availability zones, such as by using separate power supply or network equipment.
You define the availability zone in which a specified Compute host resides locally on each server. An availability zone is commonly used to identify a set of servers that have a common attribute. For instance, if some of the racks in your data center are on a separate power source, you can put servers in those racks in their own availability zone. Availability zones can also help separate different classes of hardware.
When users provision resources, they can specify from which availability zone they would like their instance to be built. This allows cloud consumers to ensure that their application resources are spread across disparate machines to achieve high availability in the event of hardware failure.
Host aggregates, on the other hand, enable you to partition OpenStack Compute deployments into logical groups for load balancing and instance distribution. You can use host aggregates to further partition an availability zone. For example, you might use host aggregates to partition an availability zone into groups of hosts that either share common resources, such as storage and network, or have a special property, such as trusted computing hardware.
A common use of host aggregates is to provide information for use with the compute scheduler. For example, you might use a host aggregate to group a set of hosts that share specific images.
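The interplay of the two groupings can be sketched as a simple filter: a user-facing zone narrows the candidates, and aggregate metadata narrows them further. This is an illustrative model only, not the compute scheduler's actual code; the Host class, the "ssd" key and the zone names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Host:
    name: str
    availability_zone: str
    # Metadata inherited from the host aggregates this host belongs to.
    aggregate_metadata: Dict[str, str] = field(default_factory=dict)


def filter_hosts(hosts: List[Host], zone: Optional[str] = None,
                 required: Optional[Dict[str, str]] = None) -> List[str]:
    """Return names of hosts in `zone` whose metadata matches `required`."""
    required = required or {}
    return [
        h.name for h in hosts
        if (zone is None or h.availability_zone == zone)
        and all(h.aggregate_metadata.get(k) == v
                for k, v in required.items())
    ]


hosts = [
    Host("rack1-a", "power-a", {"ssd": "true"}),
    Host("rack1-b", "power-a"),
    Host("rack2-a", "power-b", {"ssd": "true"}),
]
print(filter_hosts(hosts, zone="power-a"))
# ['rack1-a', 'rack1-b']
print(filter_hosts(hosts, required={"ssd": "true"}))
# ['rack1-a', 'rack2-a']
print(filter_hosts(hosts, zone="power-b", required={"ssd": "true"}))
# ['rack2-a']
```

Here the zones follow the power-source example from the text, while the aggregate metadata cuts across zones to mark a hardware property.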
Another very useful feature for scaling is Object Storage Global Clusters. This adds the concept of a Region, and allows scenarios such as having one of the three replicas in an off-site location. Check the SwiftStack blog for more (http://swiftstack.com/blog/2012/09/16/globally-distributed-openstack-swift-cluster/).
If you've looked at the storage options, determined which types of storage and how they will be implemented, planned the network carefully (taking into account the different ways to deploy it and how it will be managed), acquired metrics to design your cloud controller, then considered how to scale your cluster, then you are probably now an OpenStack expert. In which case, we'd encourage you to share your findings with the community!