Thus far we’ve covered a lot of ground in the minutiae of the various components in Puppet. We’ve talked about what the Puppet platform is and some of the foundations upon which it is built. We’ve also talked about the Resource Abstraction Layer, resources, and the client/server model.
The thing is, I can list all the various components that get applied against a system, and the things that might occur when that happens, but so far nothing ties these base components together.
How do I apply resources against a system? How do I only apply resources against a subset of systems? What are these artifacts called, and how do they interrelate?
These are only a few of the questions we’ve generated in this series.
The Manifest
Puppet has chosen the idiom of a “manifest” to be the core component which contains code that you would apply against a given system. Why was this term chosen? Let’s look at the definition of the word:
man·i·fest
/ˈmanəˌfest/
noun
noun: manifest; plural noun: manifests
1. a document giving comprehensive details of a ship and its cargo and other contents, passengers, and crew for the use of customs officers.
So, in this idiomatic reference, we would see a manifest as a list of things. That’s fine, but the disconnect, in my opinion, in the documentation is how to collect those manifests together, and how to get them onto a destination system. The data is there, but it’s spread across the documentation by categories, and it’s not readily clear how they relate. So, let’s break it down from the simplest component to the largest.
The Class
In the world of development, of whatever type, there are certain components we refer to as constructs. As redundant as it seems, the dictionary tells us that to construct (con-STRUCT) something is to build or form it by putting together parts, but a construct (CON-struct) is also something that has been constructed. For instance, “The carpenter constructed a dresser.” This is the first meaning, to build something from parts. “The carpenter’s new dresser was a beautiful construct.” This is the second meaning, the thing that was built.
For our purposes, the aforementioned manifest would be considered a construct in that it is something built of or formed by putting together “parts”, which in our case would be resources and other code elements. Why am I talking about “constructs” and “code elements” under the heading of “Class”?
I need you to understand the components that make up a class before we talk about classes, which is now. Here is Puppet’s explanation:
In the Puppet paradigm, we talk of “classes”. Classes in Puppet are simply “Named blocks of Puppet Code”. 1 The above video goes into more depth and at a much more rapid rate than I am currently, but I wanted you to hear their explanation.
I, however, will break down the manifest and the contained class which is this “named block of Puppet code”. First, the file that contains the named block of code (the Manifest) and then the code itself. In Puppet parlance, this is the “class” designation, and it looks like so:
Now clearly this is a simplistic example, but it is an example all the same. As you can see, there’s a place in the middle there where other code would go…as much as you like! And there is the important line that names the class “foo”. Why is this important?
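Since the listing itself is not reproduced here, below is a minimal sketch of what such a class definition could look like. The class name foo comes from the text; the notify resource in the body is only a placeholder I am assuming for illustration.

```puppet
# A minimal sketch of a "named block of Puppet code".
# 'foo' is the class name used in the text; the body is a placeholder.
class foo {
  # Other Puppet code would go here, as much as you like.
  notify { 'hello from foo':
    message => 'The foo class was applied.',
  }
}
```

Note that defining a class only describes it. Nothing is applied to a system until the class is declared somewhere, for example with include foo.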
In coding, we have many thousands of lines of code. Some are collections of functions and some are collections of procedures. Perhaps we want to reference or include code from one bit of our codebase…maybe we did something super-cool. You can either copy/paste that code verbatim, or you can use a little trick where you just refer to that code. By naming our class, we can refer to the code by name rather than writing it or using the same methods all over again.
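As a rough sketch of that “refer to it by name” idea: the class is written once and then pulled in wherever it is needed. The class names webserver and database are hypothetical, chosen only for this illustration.

```puppet
# Written once...
class foo {
  notify { 'foo applied': }
}

# ...and referred to by name from other code (hypothetical classes).
class webserver {
  include foo
}

class database {
  include foo
}

# Both classes reference foo, yet nothing inside foo is copied or rewritten.
include webserver
include database
```

Even though foo is included twice here, Puppet adds it to the catalog only once, so referring to the same class from several places is safe.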
It is important to note here that if you have a computer science background, you may have come across the term “class” in some languages or object-oriented programming in general. THIS IS NOT THAT. In fact, it would likely be in your best interest to not try and relate Puppet classes to anything you’ve used in the past. Are there similarities? You bet. Can you apply the same rules and ideas around Puppet classes as other paradigms? Not at all. In short, forget everything you know about classes. 🙂
Manifests (again)
Puppet tells us that resources are declared in manifests. They also tell us that manifests are “Puppet language files that describe how the resources must be configured”2 and that “manifests are a basic building block of Puppet and are kept in a specific file structure known as a module.” “Module” is a bit far along the learning curve right now, and not something I want to hit just yet. We will cover modules later; for now, just think of a module as a virtual bucket that holds your code, whether that be manifests, scripts, tasks or plans, templates, and more. We will get to all of it.
Tying everything together, we see that resources are individual elements describing the desired state for some aspect of a system. Resources are contained in classes, which are named blocks of Puppet code, and those classes live inside of files on disk known as manifests.
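To picture those three layers in one place, here is a small sketch of a single manifest file. The file name motd.pp, the class name motd, and the message text are all assumptions made for the example, and the owner and group values assume a typical Linux system.

```puppet
# motd.pp: a manifest, which is simply a file on disk holding Puppet code.
# (The file name is hypothetical.)

# A class: a named block of Puppet code.
class motd {
  # A resource: the desired state for one aspect of the system.
  file { '/etc/motd':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => "This system is managed by Puppet.\n",
  }
}

# Declaring the class by name applies the resources it contains.
include motd
```

A file like this could be tried on a test machine with puppet apply motd.pp.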
“That’s great, Questy, but how does this help me?”
I’m glad you asked.
Now we have several components that assist us in developing code that will configure our systems. The most important of these is the idea of a class that has a name that you can refer to from anywhere. This entire post (and some of the last one) has been targeting this specific idea.
Next time, we will talk about Desired State Configuration again, but we will add a new concept known as idempotency. This core idea is the power and elegance of Puppet in a nutshell, and we’ll talk about that. Also, we will begin to look at a module and what it is as well as how it works and is used in Puppet.
Welcome back to the Puppet Primer series. As I mentioned in the first installment of this series, we were to cover resources in depth later, and this is that article.
In the context of Puppet, when we say “resources”, we are referring to the fundamental unit we use in modeling system configurations. Puppet says it this way:
Resources are the fundamental unit for modeling system configurations. Each resource describes the desired state for some aspect of a system, like a specific service or package. When Puppet applies a catalog to the target system, it manages every resource in the catalog, ensuring the actual state matches the desired state.1.
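To make that definition a little more concrete, here is a small sketch of two resources, one for a package and one for a service. The names chrony and chronyd are assumptions that fit a Red Hat-style system; substitute whatever your platform uses.

```puppet
# Desired state for a package: it should be installed.
package { 'chrony':
  ensure => installed,
}

# Desired state for a service: running now, and enabled at boot.
service { 'chronyd':
  ensure  => running,
  enable  => true,
  require => Package['chrony'],  # manage the package first
}
```

Each block describes the state one piece of the system should be in. When the catalog is applied, Puppet compares that description to the actual state and changes only what differs.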
But herein there is a disconnect. One might read this and say to themselves… “so what? How does this help me configure this fleet of computers here?” Here is where some amount of remediation and ordering of our thoughts is appropriate.
Infrastructure as Code
The buzzword above is pervasive today. Whether we’re talking about cloud computing resources, local VMs, or even containers and groups of containers, the buzzword “Infrastructure as Code” or “IaC” has been overused and is as loaded as the term “DevOps”. So first, let’s break down IaC. In short:
Infrastructure as code (IaC) is the process of managing and provisioning computer data center resources through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools.[1] The IT infrastructure managed by this process comprises both physical equipment, such as bare-metal servers, as well as virtual machines, and associated configuration resources.2.
To restate, in the old days we connected to individual systems one at a time to perform a configuration action against that machine. It may be to set up a web server, edit a DNS Zone file, configure a mail server, or even just modify content, but it was a very manual and very laborious process. It was prone to error and misconfiguration, and to differences between systems known as “drift”, resulting in improper or incomplete coverage of the environment.
Infrastructure as Code lends us the opportunity to think about our systems in a different way, as code elements rather than physical machines with installed software. One might then consider “All that is great and I can learn how to do that, but what provides me the interface or interaction with the operating system and hardware to see the effective change that I want?”
Your systems need some sort of interface between code and configuration, an interface to the system that can occur programmatically, and a method for applying that code-based work to the operating systems and the underlying hardware. We refer to the language as a “Domain Specific Language”. A Domain Specific Language simply means you have a language construct that is designed to manage one group of things in a particular domain or “grouping of things”.
A domain specific language such as the Puppet language is a “desired state configuration” language or a DSC. Puppet describes the language here:
The “secret sauce” of Puppet is the layer that translates your commands into Operating System configurations. This is the “resource abstraction layer” we spoke about in Puppet Primer I.
Puppet has taken the time to write code that recognizes and can configure many different Linux/UNIX platforms as well as macOS and Windows. Consider the files on a Puppet system that can be found in the following location:
In this location, there are hundreds of files, functions, and methods that are designed by Puppet to address the various components of the system they are running on. For instance, in this directory there is a “provider” subdirectory that contains many components of the Puppet language you may be familiar with:
exec file group package service user
What Puppet has done here is create code that recognizes the platform you are running on, and accommodates whichever platform it is to perform an action. For example, the “user” directory here has several files:
All of these files accommodate the various circumstances one might encounter when trying to add a user. Let’s look at the “useradd” file. In particular, let’s compare components that add the user in various platforms. Here is a small part of the useradd.rb file:
      if value == :absent
        if Puppet.runtime[:facter].value('os.name') == 'SLES' && Puppet.runtime[:facter].value('os.release.major') == "11"
          -1
        else
          ''
        end
      else
        case Puppet.runtime[:facter].value('os.name')
        when 'Solaris'
          # Solaris uses %m/%d/%Y for useradd/usermod
          expiry_year, expiry_month, expiry_day = value.split('-')
          [expiry_month, expiry_day, expiry_year].join('/')
        else
          value
        end
      end
    },
As you can see, code has been written here for “useradd” to accommodate SLES and Solaris. There are many more examples where code accommodates special functions, features, information, methods, command line switches and much more. When working with customers, I often will bring them here and walk through this particular portion of Puppet even though you never interact with this code or change it in any way. It gives sysadmins and developers a peek under the covers as to how resources are managed, and how (with Ruby) they are “separated” from the underlying hardware and operating system with accommodating code.
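From the Puppet-language side, the provider code above is what lets you write an expiry date one way everywhere. A small sketch, assuming an illustrative user and date:

```puppet
user { 'bob':
  ensure => present,
  # In Puppet code the expiry date is written as YYYY-MM-DD on every platform.
  expiry => '2030-12-31',
}
```

Following the Ruby excerpt, on Solaris the provider would split this value on the dashes and rejoin it as 12/31/2030 before handing it to useradd or usermod, while on the other platforms handled by this provider the value passes through unchanged. You never write that translation yourself.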
Commonly Used Resources
For our purposes here, we will deal with a small subset of resources, namely: “file”, “user”, “exec”, “service”, and “package”. We will also consider “catalogs” and “facts” here since they all work hand-in-hand.
As noted above, we see that code has been written that directly interoperates with the OS, thus shielding you from the idiosyncrasies of the underlying platform. By learning the Puppet DSL instead of each platform upon which you need to work, you only need to learn one command to add a user rather than “useradd”, “adduser”, “smitty”, “sam”, and any number of other platform commands you might need to interact with. Consider the following:
user { 'bob':
  ensure   => 'present',
  uid      => '500',
  gid      => '500',
  comment  => 'Bob from Accounting - x321',
  shell    => '/bin/bash',
  home     => '/home/bob',
  password => '1/A5wIWqKsPYo',
  ...
}
The rest of the attributes you can choose to configure the user “bob” can be found in the reference located here. The great thing about Puppet code is this above block can be applied to any system of any type that Puppet supports. One set of code applied to your entire fleet, whether Ubuntu, RedHat, Windows, Solaris, or AIX. It gives you the opportunity to learn one configuration language to do what you want to do, leaving the specifics of the underlying OS’s user management to Puppet.
In the preceding block, we refer to the “user” portion as the resource type. The name “bob” on the same line is the resource title, and by default the title also supplies the namevar, or “name” attribute. You can also give your resources interesting and descriptive titles; in that case, you set the namevar itself to the username like so:
user { 'Bob Williams - Accounting, Desk 123AX - Extension 321':
  ensure   => 'present',
  name     => 'bob',
  uid      => '500',
  gid      => '500',
  comment  => 'Bob Williams',
  shell    => '/bin/bash',
  home     => '/home/bob',
  password => '1/A5wIWqKsPYo',
  ...
}
The options are limitless, and are defined by you to be whatever you want.
Resources across the gamut in Puppet follow the same general rules. You can browse the available resource types and links to all the options supported by each one here.
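As a quick sketch of those same general rules on other types: every resource has a type, a title, and attributes, and the title stands in for the namevar unless you set the namevar yourself. The paths, titles, and package name below are made up for illustration.

```puppet
# file: the namevar is 'path', set explicitly so the title can be descriptive.
file { 'Primer scratch file':
  ensure  => file,
  path    => '/tmp/puppet_primer.txt',
  content => "Hello from Puppet.\n",
}

# exec: the namevar is 'command'. 'creates' stops it from running again
# once the marker file exists.
exec { 'Create a marker file once':
  command => '/bin/touch /tmp/puppet_primer_marker',
  creates => '/tmp/puppet_primer_marker',
}

# package: here the title doubles as the namevar ('name').
package { 'tree':
  ensure => installed,
}
```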
Manifests
The next most common term you’ll hear in Puppet-world is this concept of a “manifest”. Puppet has adopted the idiom of resources collected together into manifests, manifests collected together into modules (along with other components), modules collected together in profiles and profiles collected together into roles.
This sounds like a lot to consume, but in actuality, except for manifests and modules, all of the above are logical organizational units known as a design pattern, which is simply a defined way to organize and arrange code to be used in an environment. We will go into detail on manifests in the next installment.
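A rough sketch of how the roles-and-profiles pattern tends to look in code follows. Every class name and package name here is hypothetical, and in a real codebase each class would live in its own manifest inside a module rather than together in one file.

```puppet
# Profiles: each wraps the resources (or module classes) for one concern.
class profile::base {
  package { 'chrony':
    ensure => installed,
  }
}

class profile::webserver {
  package { 'httpd':
    ensure => installed,
  }

  service { 'httpd':
    ensure  => running,
    enable  => true,
    require => Package['httpd'],
  }
}

# Role: describes what a machine is, built only out of profiles.
class role::webserver {
  include profile::base
  include profile::webserver
}

# A system is then given exactly one role.
include role::webserver
```

Nothing in the Puppet language enforces this layering. It is a naming and organizing convention, which is why it is called a design pattern.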
To get to the core of where we are heading with the entire idea of Puppet, and the larger concepts involved, we need to step back to the core of what we’re dealing with. When discussing servers, groups of servers, and their configuration, we first have to conceptualize the greater construct when we say “systems”.
When we say “systems”, we are including any computer operating system running on any server connected to an internetwork either locally on your business premises, in a co-location facility, data center, or even system instances in a cloud services platform. This would include your home as well.
Clearly, since we are discussing a configuration management platform such as Puppet, we sort of already knew that, but it bears repeating.
How Do We Communicate With a System?
It is a given that computers that are “internetworked” together each run an operating system containing a network stack that allows the system to send data over “the wire” (an Ethernet network) to another system on the same or a remotely connected network. This is achieved via a set of network communication protocols known as TCP/IP, also called the “Internet Protocol Suite”.
NOTE: The above linked topics in the text of the article are considerably outside the scope of this article, but are of PARAMOUNT importance to the functioning of Puppet. It is assumed the reader is knowledgeable about the core infrastructure tools and protocols outlined here, but the links are provided at your convenience.
To communicate between systems, many different methods can be employed. For instance, one can utilize a web browser to access Puppet.com by simply typing in the browser address bar https://puppet.com. As anyone who has gone through a technical interview of sufficient thoroughness would attest, to describe the entirety of the digital “conversation” that ensues would be prohibitive due to space. Just know that the web browser first looks up where to find that machine or machines that represent Puppet.com, it traverses the Internet and retrieves the page, displaying it in your web browser.
Puppet functions in much the same way.
The Client/Server Model
We’ve talked about systems theoretically and in actuality. Now, we have to discuss communications between these systems beyond just their “connectedness” across a wire.
Once machines are connected one to another, there are a series of networking protocols that communicate largely unseen. Your connection to the Internet, if it is up and working, “just works”™. You can go to YouTube, read a website, and even watch television. But in the case of applications like Puppet, there is a greater communications relationship between systems. This relationship is generally called “protocol”.
When we speak of “protocol” in non-digital scenarios, we might mean something like the following:
1. the official procedure or system of rules governing affairs of state or diplomatic occasions. “protocol forbids the prince from making any public statement in his defense”
2. the original draft of a diplomatic document, especially of the terms of a treaty agreed to in conference and signed by the parties. “signatories to the Montreal Protocol”
As you can see, not a whole lot of computer wrangling to be found in this definition. But the same guard rails exist. Namely: “The official procedure or system of rules governing…”. In terms of computer systems, this is an example of an official procedure or system of rules governing communications between systems. How long is the connection to be open? Who starts the conversation and who sets the parameters for that conversation including when it ends? The “protocol”, then, denotes those “rules of engagement”, if you will.
The promise of configuration management in general and Puppet in particular is this: That one admin can systematize the way she manages infrastructure in such a way that can be broadly applied across systems or groups of systems programmatically rather than one system at a time. Configuring, say, SSH on a single system, then connecting to the next system and repeating that for all affected systems in your environment, can take precious time you likely do not have. Instead, by referring to systems and the services on those systems in groups, it is much more efficient to systematically configure SSH against an entire grouping of systems all at once.
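Here is a hedged sketch of what “configure SSH against an entire grouping of systems” might look like. The package name openssh-server, the service name sshd, and the web01/web02-style hostnames are assumptions; package and service names in particular differ between distributions.

```puppet
# One description of how SSH should look.
class ssh {
  package { 'openssh-server':
    ensure => installed,
  }

  service { 'sshd':
    ensure  => running,
    enable  => true,
    require => Package['openssh-server'],
  }
}

# Every node whose name matches this pattern gets the same configuration
# (web01.example.com, web02.example.com, and so on).
node /^web\d+\.example\.com$/ {
  include ssh
}

# Nodes that match nothing above receive an empty configuration.
node default {
}
```

Adding a fiftieth web server then requires no new code at all, provided its name matches the pattern.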
In the Puppet world, this is achieved by the manner in which Puppet has “decided” upon protocol to both communicate and distribute and apply configurations to your environment.
The Puppet Conversation
There are many documents and explanations on the Internet around Puppet and how it works. What I will attempt to do is relate my explanation to as many specific Puppet documents as I can. I understand this presents a large volume of information to parse through, but this base operational function is paramount to understand if you wish to master the Puppet environment.
The Puppet ecosystem consists of many components, but we will begin with what we see in the image above. At its most basic level, Puppet consists of the Puppet server system and any number of client systems that connect to this server to retrieve configuration elements that are available to apply to themselves. In many circumstances, when we discuss client/server communications, we speak of a server that contains the configuration and directs all the traffic in the ecosystem, “deciding” when and where things happen, and how they happen.
Some of these elements are true, but in the case of Puppet, the power to establish and conduct the conversation has been placed in the hands of the system being managed rather than dictated by the Puppet server. What does this mean?
The writers of Puppet have given the Puppet agent software, which runs on the systems being managed, control over when it requests information from the server. By default, Puppet “wakes up” on the agent system every 30 minutes, and asks the server for its configuration. This, in effect, is a client-server conversation. When the client (agent) system “wakes up” to perform a run, here are the procedures it follows in its default configuration:
The client software, which is running on a timer, becomes active every 30 minutes.
The agent software downloads the CA (Certification Authority) bundle from the server.
If certificate revocation is enabled, it also will download the Certificate Revocation List (CRL), utilizing the CA it just downloaded to verify the connection.
If there is a conflict in retrieving the certificate requiring some kind of resolution or remediation on the Puppet server (cleaning an old CSR or certificate) the agent will then “sleep” for the configured time period stored in the waitforcert configuration variable. (default: 2 minutes)
If the downloaded certificate fails verification, such as not matching its private key, the agent will discard the certificate. The agent will then sleep for the configured waitforcert period and repeat the process.
While this may seem like a lot, this is the primary conversation entered into between the agent and the server to ensure (by way of SSL) that each node is the node it represents itself to be, and that it is authorized to communicate with the Puppet server.
Once the server and the node connecting to the server are rather certain who each other is, the Puppet conversation can begin. Think of the above as the portion where someone pulls you aside and says “let’s go somewhere we can talk privately”.
The next thing that happens is the agent node requests its “node object” (more on this later) so that it can drop into the working “environment” it belongs in.
If the API call is successful, the agent then reads the environment from the node object. If the node object has an environment, use that environment instead of the one in the agent’s config file in all subsequent requests during this run.
If the API call is unsuccessful, or if the node object has no environment set, use the environment setting from the agent’s config file.
Since Puppet is an extensible platform, there are many added features and functions you can add to the Puppet environment. One of these is known as a “plugin”, and Puppet distributes these plugins from the server to its agents by a process known as pluginsync. In short, pluginsync is the mechanism by which Puppet synchronizes plugin code, including the custom code that gathers system profile information (known as “facts”) delivered via a platform component known as “facter”. We will cover facter more completely at a later time; just know that the facter “facts” contain a full profile of your system that the server needs to know about.
If pluginsync is enabled on the agent system, fetch plugins from a file server mountpoint that scans the lib/ directory of every Puppet module.
Request a “catalog” from the server while submitting the latest facts produced by facter.
Do a POST /puppet/v3/catalog/<NAME> where the post data is all of the node’s facts encoded as JSON while receiving a compiled catalog from the server in return.
Make file resource requests when applying the catalog:
File resources can specify file contents as either a content or source attribute. Content attributes go into the catalog, and the agent needs no additional data.
Source attributes put only references into the catalog, and may require additional HTTPS requests.
If you are using the default compiler, then for each file source, the agent makes a GET /puppet/v3/file_metadata/<SOMETHING> request and compares the metadata returned to the state of that file already existent on-disk.
If the file is in sync, the agent moves on to the next file resource.
If the file is out of sync, the agent does a GET /puppet/v3/file_content/<SOMETHING> to retrieve the content of the file that should be on the disk.
If you are using the static compiler (a more efficient compiler) all file metadata is embedded in the catalog. For each file source, the agent compares the embedded metadata from the catalog to the file contents on disk.
If the file content is in sync, the agent moves on to the next file resource.
If the file is out of sync, it performs a GET /puppet/v3/file_bucket_file/md5/<CHECKSUM> for the content.
NOTE: Using a static compiler is more efficient with network traffic than using the normal (dynamic) compiler. Using the dynamic compiler is less efficient during catalog compilation. Large numbers of files, especially recursive directories, amplify either issue.
Finally, if the agent’s configuration has “report” enabled on the agent node, the Puppet agent will then submit the report to the server by performing a PUT /puppet/v3/report/<NAME>
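The file-resource step in the run above distinguishes between content and source attributes. A small sketch of the two styles follows; the file paths and the module name ssh in the puppet:/// URL are assumptions, and that URL only resolves if such a module, with that file in it, exists on your server.

```puppet
# content: the text of the file travels inside the compiled catalog,
# so the agent needs no additional data from the server.
file { '/etc/motd':
  ensure  => file,
  content => "Managed by Puppet.\n",
}

# source: the catalog carries only a reference. While applying the catalog,
# the agent may make extra HTTPS requests (file_metadata, then file_content
# if the file on disk is out of sync) to fetch it.
file { '/etc/ssh/sshd_config':
  ensure => file,
  source => 'puppet:///modules/ssh/sshd_config',
}
```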
Comments and Contents
This may seem like overkill in the grand scheme of a Puppet primer but I guarantee you will definitely be thankful we “went there” this early in the process. When we start breaking down all the various components like facter, catalogs, resources, and the push-pull nature of the conversation between the Puppet server and agents, having a fundamental knowledge of the process that is occurring as well as having a reference to come back to will be invaluable.
I’d like to thank Puppet for their encyclopedic platform reference. Largely the entire page here came nearly directly from the documentation. I tried to link specific pages where I was either referring to or introducing some new concept or platform component we have not yet discussed. I am primarily referring to the Puppet reference on Agent-server HTTPS communications found here specifically as it relates to the agent-side checks and HTTPS requests made during a single Puppet run.
Idiosyncratically, Puppet will perform in specific ways and perform considerably more functions and details than what is related here, but the items here are documented and transparent whereas some of the other items are “private” functions and API calls, and are not intended for customer consumption.
One of the perennial problems with a platform like Puppet is not the lack of documentation in and of itself, but more so the lack of documentation pitched at each level of documentation consumer. What I mean by this is, we may have a lot of documentation available (take the veritable encyclopedia of data at https://puppet.com/docs), but we may not actually have access to beginner-level accessible data.
What do I mean by “accessible”?
Most of the professional documentation I’ve encountered around the Puppet ecosystem is produced by brilliant engineers. Oftentimes, though, as we progress forward in our development on any platform, we tend to forget from where we came, and the level of knowledge we might have had at our earliest stages of development. As a result, we get really advanced documentation that the novice user is left trying to apprehend.
This series will attempt to alleviate some of that. It should be noted that the majority of the work in this series of documentation will be on Puppet Community. Where possible, when documenting features or functions, I will link official documentation, Puppet Git projects, or some other foundation for the assertions and documentation that I am making.
What is Puppet?
First and foremost, we should answer the question “What is Puppet”. Now, generally, when we come to this question, many have already answered that question, and are likely looking here for advanced information. However, since this is a primer, we will cover that here.
When we talk about Puppet, we’re not talking about these guys:
What we’re talking about is an automation platform used by System Administrators and Engineers (primarily) to automate their work at scale. In the past, SysAdmins would keep a list of hosts they managed locally in text files or in SSH session management software like PuTTY, and would use any number of mechanisms that allowed them to automate procedures across n+1 nodes. In some cases 5-10 at a time, and in other cases hundreds or thousands. However, there were problems…
With hundreds or thousands of nodes, you could only “chunk” your actions into groupings between 20-40 at a time, and many times the commands you would execute would be performed in a serial fashion, taking time during which you would have an environment that was not fully in sync. Therefore, people could conceivably reach resources that have had changes, refresh a browser or a thin client, and get an entirely different experience, feature set, or even data. This was no good.
Additional problems would be that as your fleet grew or shrank, the lists of nodes you have may become out of sync, and if you had multiple engineers, your lists may be out of sync with each other, causing coverage issues when trying to make changes to your environment. Maybe some engineers were using one set of scripts or node lists, and another was using different scripts or node lists, and perhaps the functionality between them differed. This made for a lack of predictability in how an environment was configured and/or was functioning, and would conceal states of “drift” from node to node.
Enter Puppet
Puppet’s creator, Luke Kanies, found this disarray noted above in the System Administration space, and decided to create Puppet. You can learn a bit more about the early days and development of Puppet from an O’Reilly interview with Luke here:
Luke’s main development that sort of revolutionized the configuration paradigm was the development of a “Resource Abstraction Layer”, which he describes in the video. For a little more in-depth coverage of the RAL and how it works, check out these articles here:
In short, the RAL is the “thing about the thing”. In these days of “meta-everything”, it seems odd to use such a reference, but the referencing system works.
When approaching a system for system administration purposes, there are files, packages, users, configurations, text, binaries, repositories… many different things you have to be aware of. As a result, you build up a skillset consisting of a knowledge of not only what these things are, but how they work, are configured, and interact with other subsystems. Luke’s main development here was to build this RAL as a “modeling system” to approach a server or series of servers programmatically.
As a result, a system was broken down into components known as “resources”, which are the fundamental unit for modeling a system configuration in Puppet.
Puppet has not just simplified the configuration of systems in an IT infrastructure; it has made it possible to assert a configuration against many systems at once. Still, this isn’t the main power behind Puppet. The main power is to be able to collect various resources into groupings, and to apply those configurations programmatically, allowing you to work with code as your infrastructure rather than individual machine configurations.
How Does it Work?
When you use Puppet, you define the desired state of the system in code. This code is written in a Domain Specific Language (the Puppet DSL) which you use against a wide array of operating systems and devices, defining the desired state of those systems, not how to get there. Puppet, by utilizing its RAL along with a Puppet Server and Agent, interprets the code you’ve written and configures the destination machine to match it. The organization of that code flow and system configuration looks like so:
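As a small illustration of “the desired state, not how to get there”, consider this sketch. The directory path is hypothetical, and the owner and group values assume a Linux system.

```puppet
# Desired state: this directory exists with these owners and permissions.
# Nothing here says "run mkdir" or "run chmod". On each run Puppet works out
# which steps, if any, are needed to make the system match.
file { '/opt/app':
  ensure => directory,
  owner  => 'root',
  group  => 'root',
  mode   => '0755',
}
```

If the directory already exists with the right settings, applying this makes no changes at all.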
Altogether, this suite of tools, coding, methods, and components makes up the Puppet Platform. In our next installment, we will break apart this platform into its separate components and see what’s under the hood.
In my Puppet travels over the last 10 or so years, one topic has continued to arise time and again, and that has been the ability to scale Puppet Community (formerly Puppet Open Source) to thousands of nodes.
While the best route to go is to use Puppet Enterprise for solid support and a team of talented engineers to help you in your configuration management journey, sometimes the right solution for your needs is to use Puppet Community. What follows is the product of my resolving to get to the bottom of the procedure and make it easy to perform repeatedly and to assist in scaling Puppet Community implementations for larger environments.
Even though this article presents a somewhat rudimentary configuration, you can add PuppetDB, external instrumentation and telemetry, etc. and grow the environment to a truly large enterprise-class system.
The Design
The design of such an environment is highly flexible, with many hundreds of potential configurations. In this specific scenario, I plan to cover an independent Master performing CA duties for the architecture, with several catalog compilers placed behind a TCP Load Balancer for catalog compilation. Once the specific moving parts are identified, modern System Engineering practice can be applied to expand and scale the installation as needed.
Architecture
The Puppet Master
Every Puppet implementation has one of these. Whether there are extra compilers or not, the primary master is tooled to be not only a CA Master but also a catalog compiler in its own right. Even if you tune the master and place it on beefy hardware, you can expect to eventually reach a limit to the number of nodes you can serve. If you add PuppetDB to the mix, there’ll be different requirements, but generally speaking, you will want to offload PuppetDB to a different server so as to keep the master free to serve CA requests to the environment.
The Load Balancer
For this function, you simply need a TCP Load Balancer. An Elastic Load Balancer in AWS would serve nicely, as would HAProxy on a large-ish machine (t2.xlarge). In short, this load balancer simply needs to be able to see that a port is up on the destination nodes in the serving pool, and then proxy the connections to a member of that pool that is in a healthy state. You may also wish to poll the Puppet health check API endpoint at the load balancer to ensure the pool is healthy, so that the load balancer only forwards requests to healthy catalog compilers.
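A minimal sketch of such a health check, assuming the compilers expose Puppet Server’s status API on the default port 8140 and that the “simple” status endpoint is the one being polled. The host name is the example compiler used later in this article, and the helper names are hypothetical:

```shell
# Build the status URL for one pool member (port 8140 is assumed to be the default).
status_url() {
  echo "https://$1:8140/status/v1/simple"
}

# Succeeds only when the compiler reports "running".
# -k skips certificate verification, which is assumed acceptable for a probe.
compiler_is_healthy() {
  [ "$(curl -sk "$(status_url "$1")")" = "running" ]
}

# Usage:
#   compiler_is_healthy compiler1.example.com && echo "in rotation"
```

Most load balancers can perform the same probe natively, so a script like this is mainly useful for checking a pool member by hand.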
Note also that for the purposes of this discussion, it is assumed you have set up this load balancer and assigned the name compile.example.com to the VIP IP (10.0.100.20 below).
The Compilers
These are simply Puppet Server installations that have had the CA utility turned off, and you have configured client nodes to look to the master for this information. These nodes will sit behind the load balancer and take catalog requests from Puppet agents as though they were the only Puppet server and perform standard Puppet server requests (minus the CA work).
The Platform
My practice and work in this implementation was done in AWS. You can do the same work in Digital Ocean, Linode, or on physical hardware. The important part is not the size or location of the nodes I’ve used, but the configuration I will enumerate below. As long as the configuration is maintained, results should be relatively consistent from platform to platform.
The Procedure
I performed this installation several times as though the setup did not have DNS resolution. By this, I mean that I did all name resolution in host files. You can easily manage these in Route 53 or you can add “A Records” in your own DNS serving infrastructure. The process I outline here is the former, using the host files.
First, accommodate the names of your systems by laying out what the names will be and the structure of the addresses as needed. In the case of this reference implementation, the /etc/hosts file is as follows:
For each node I provision, I immediately configure the /etc/hosts file to contain this information so all nodes can reach each other by name. This is to satisfy the stated requirements of Puppet itself that name resolution/DNS needs to be configured and functioning.
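As an illustration of those host entries, here is a sketch using the names from this article. Only the VIP address (10.0.100.20) comes from the text; the other two addresses and the helper name are hypothetical placeholders:

```shell
# Print the host entries shared by every node in the architecture.
# Only 10.0.100.20 (the VIP) is taken from the article; the rest are placeholders.
hosts_entries() {
  cat <<'EOF'
10.0.100.10  puppet.example.com     puppet
10.0.100.20  compile.example.com    compile
10.0.100.31  compiler1.example.com  compiler1
EOF
}

# Usage (as root, once per node):
#   hosts_entries >> /etc/hosts
```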
Next, we need to install Puppetserver on the master. This is straightforward as mentioned in the Puppet docs here: https://puppet.com/docs/puppet/latest/puppet_platform.html#task-383
So, on RHEL 7, you would enable the Puppet Platform repo
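A sketch of that step, assuming the Puppet 6 platform and the release package naming used on yum.puppet.com. Both the version and the helper name are assumptions, so match them to the docs page linked above:

```shell
# Release package URL for a given platform collection and EL major release.
# "puppet6" and "7" in the usage line are assumptions, not values from this article.
puppet_release_rpm() {
  echo "https://yum.puppet.com/$1-release-el-$2.noarch.rpm"
}

# Usage (as root):
#   rpm -Uvh "$(puppet_release_rpm puppet6 7)"
```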
Then you would install the PuppetServer package itself:
yum install puppetserver
or
apt-get install puppetserver
Be sure to source the newly installed profile so the following commands can be found in your path. To do so, run:
source /etc/profile.d/puppet-agent.sh
Do this before continuing so that all the Puppet commands you need are in your path.
At this point, we want to configure Puppet Server before allowing it to start to allow for alternate DNS names when signing certificate requests, which accommodates the name on the load balancer VIP as well as the individual compiler node names when you begin standing up catalog compilers. To do this, we need to edit the file /etc/puppetlabs/puppetserver/conf.d/ca.conf. In the file’s documentation it enumerates the new line we need:
allow-subject-alt-names: true
The subject-alt-name is an X.509 extension that allows additional names to be associated with a certificate. Here, we’re leveraging the extension so that certificates signed by the CA Master can carry the names of both the VIP and the compilers, making them acceptable to the master and to any connecting node.
The final step before starting the puppetserver is to generate a root and intermediate signing CA for the puppetserver, as it will be terminating SSL requests for the architecture. To do this, simply run:
puppetserver ca setup
Once you have added the above line and set up the CA, it is time to start the server. On both platforms, you would run the following systemd control commands:
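These are most likely the usual enable-and-start pair. A minimal sketch, assuming the service unit is named puppetserver as installed by the package (the wrapper function is hypothetical):

```shell
# Enable the service at boot, then start it now.
start_puppetserver() {
  systemctl enable puppetserver &&
  systemctl start puppetserver
}

# Usage (as root):
#   start_puppetserver
```

The same pair applies later on each compiler, once its certificate has been signed.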
When puppetserver starts, it will begin behaving as a CA master, capable of both terminating SSL and compiling catalogs. The Puppet documentation for that file is obliquely referenced here and weighs heavily in this configuration.
At this point, the Puppet Server is running and is accepting requests for signing.
Your First Compiler
Next, we need to install a compiler, but we need to make sure that compiler will accept catalog compile requests but not provide CA services at all.
This server needs to know about itself and its own job, where the CA Master (Puppet Master) is, and what names it has and is responsible for. First, install Puppetserver on the compiler (in our example, compiler1.example.com). As soon as the PuppetServer is installed, but before it is started, you need to configure the following to represent the compiler you are configuring:
Edit the /etc/puppetlabs/puppet/puppet.conf and create a “main” section as follows:
In this way, you’re specifying all names that particular compiler is authorized to “answer” for, namely its own certname, its own hostname, the load balancer’s certname, and the hostname portion of that certname as well.
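A sketch of what such a “main” section could look like, generated from the names used in this article. The certname line and the helper function are assumptions; server and dns_alt_names are the settings this article describes:

```shell
# Print a [main] section for one compiler.
# $1 = compiler FQDN, $2 = load balancer VIP FQDN, $3 = CA master FQDN
compiler_main_section() {
  cat <<EOF
[main]
certname = $1
server = $3
dns_alt_names = $1,${1%%.*},$2,${2%%.*}
EOF
}

# Usage (as root, before puppetserver is started for the first time):
#   compiler_main_section compiler1.example.com compile.example.com puppet.example.com \
#     >> /etc/puppetlabs/puppet/puppet.conf
```

Deriving the short names from the FQDNs keeps the four alternative names consistent as you add compiler2, compiler3, and so on.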
Next, you need to tell the compiler that it has specific certs. Above, you’ve already told it where its Puppet Master is (server=puppet.example.com). You’ve also told it what its own names are (compiler1.example.com,compiler1,compile.example.com,compile), which are its own host names and the host names of the VIP on the Load Balancer. You also need to tell the Puppet server on the compiler the values necessary for it to configure Jetty. Edit /etc/puppetlabs/puppetserver/conf.d/webserver.conf and add these lines to the end of the top section:
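A sketch of the Jetty SSL settings in question, assuming the agent’s default ssldir (/etc/puppetlabs/puppet/ssl) and the standard ssl-cert, ssl-key, ssl-ca-cert, and ssl-crl-path keys of webserver.conf. The helper name is hypothetical, and the paths should be checked against your own install:

```shell
# Print the SSL lines for a compiler's webserver.conf, assuming the default ssldir.
# $1 = compiler certname
webserver_ssl_lines() {
  ssldir=/etc/puppetlabs/puppet/ssl
  cat <<EOF
    ssl-cert: $ssldir/certs/$1.pem
    ssl-key: $ssldir/private_keys/$1.pem
    ssl-ca-cert: $ssldir/certs/ca.pem
    ssl-crl-path: $ssldir/crl.pem
EOF
}

# Usage: print the lines, then paste them inside the webserver: { ... } block.
#   webserver_ssl_lines compiler1.example.com
```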
Finally, you have to disable the local CA service on the compiler itself. This is accomplished by editing the file /etc/puppetlabs/puppetserver/services.d/ca.cfg. There are two lines that need to be commented/uncommented:
The distributed version of the file has the CA enabled:
# To enable the CA service, leave the following line uncommented
puppetlabs.services.ca.certificate-authority-service/certificate-authority-service
# To disable the CA service, comment out the above line and uncomment the line below
#puppetlabs.services.ca.certificate-authority-disabled-service/certificate-authority-disabled-service
puppetlabs.trapperkeeper.services.watcher.filesystem-watch-service/filesystem-watch-service
The in-line documentation tells you to comment the second line and to uncomment the 4th line to disable the CA service. Do that here so that the file looks like this:
# To enable the CA service, leave the following line uncommented
#puppetlabs.services.ca.certificate-authority-service/certificate-authority-service
# To disable the CA service, comment out the above line and uncomment the line below
puppetlabs.services.ca.certificate-authority-disabled-service/certificate-authority-disabled-service
puppetlabs.trapperkeeper.services.watcher.filesystem-watch-service/filesystem-watch-service
Once all these components are in place on your catalog compiler, you need to connect your catalog compiler to the master in the usual fashion. First, request that your local certificate be signed:
puppet ssl bootstrap --waitforcert 60
Then, on the Master, sign the certificate request by specifying the machine’s certname:
puppetserver ca sign --certname compiler1.example.com
When you sign the certificate request, the Puppet CA Master receives all alternative names from the compiler and signs all names the compiler is representing, namely compiler1.example.com, compiler1, compile.example.com, and compile. This allows an agent connecting to compile.example.com to interact with the VIP as the catalog compiler, and it will accept any of the names it sees in that communication. When the agent connects to compile.example.com and gets forwarded to, say, compiler42.example.com, it doesn’t blink because the signed cert is “acceptable” to the CA infrastructure you’re currently interacting with.
Once you have signed the catalog compiler’s certificate request, then return to the catalog compiler and perform a Puppet run:
puppet agent -t
Then, turn on the puppetserver daemon and set it to start at boot:
At this point, the master, the catalog compiler, and the load balancer are all up and running, functioning as designed. The final portion is to connect a Puppet agent to this infrastructure so it works as expected.
Connecting Puppet Agents
On any agent, you would install puppet as you would normally by first enabling the platform repo just as we did for the Puppet Servers:
On RHEL 7, you would enable the Puppet Platform repo as follows:
Once the platform repos are configured, then install the puppet agent as follows:
yum -y install puppet-agent
on the Red Hat family of servers, or
apt-get install puppet-agent
on Ubuntu.
Finally, before executing your first Puppet run, you need to edit the puppet.conf to tell the node where its resources are. (You may wish to manage puppet.conf with Puppet in your fleet as the size grows.)
In your configuration, this will reflect your infrastructure:
server=<FQDN of the Load Balancer IP>
ca_server=<FQDN of the Master>
After you’ve edited the puppet.conf, bootstrap the SSL as you did above on the compilers:
puppet ssl bootstrap --waitforcert 60
Sign the certificate request on your Master:
puppetserver ca sign --certname <FQDN of the new agent node>
Finally, on the new Puppet agent, ensure a Puppet run completes without error:
On the agent node:
puppet agent -t
If everything has been performed correctly, the Puppet agent machine will request a certificate signing from the Master. You will manually (or via autosign.conf) sign the agent’s certificate request. The agent then begins the catalog request procedure, but instead of sending that to the Master, it sends it to your load balancer VIP as you’ve specified. The Load Balancer will forward the request to one of the compilers in the pool, and a standard Puppet run will complete for the agent.
It is at this point you can follow the “adding a catalog compiler” procedure to scale your compile farm, or just continue to add agents to connect your fleet to the infrastructure.
Conclusion
If you’ve completed everything above, you should now have a scalable infrastructure for Puppet Community. The master is serving CA certificate signing and the load balancer is handing off requests to individual compilers for processing. Agents are configured in such a way as to send certificate requests to the master and catalog requests to the load balancer VIP, allowing for a greater volume of requests.
It should be noted that no tuning is called out in this procedure, but Puppet has a great deal of interesting information that might allow you to increase capacity even more. The basic tuning guide for the Puppet Server can be found here:
This will give you guidelines for tuning your server related to the hardware specifications of the server, and assist you in scheduling runs, tuning parameters, and just generally ensuring the Puppet Server is operating optimally for your infrastructure needs.
Jerald Sheets is the Owner and CEO of S & S Consulting Group. He is a Puppet Certified Consultant with Norseman Defense Services, performs consulting and training on the Puppet Enterprise and Community Platforms, and is a Puppet Certified Professional 2014 & 2018.
Why tools alone won’t help you reach automation Nirvana.
DevOps consulting and implementation at the most basic levels have been revolving around implementation and enablement for quite a number of years now. You reach out to a vendor, either buy a product or use an Open Source offering, and have a consultant come in, set up the product, and train your staff on how to use it.
However when the consultant goes home, the hard work begins, and it is questionable just how much success your team will have since the majority of the experience they’ve had so far is just learning the syntax of this new automation language. They’ve not yet learned their stride.
Automation is good. There are tons of products to automate with… Ansible, Puppet, Chef, Salt… there’s no lack of tools to use on your infrastructure. Perhaps you’re just instrumenting configuration… Perhaps more! Maybe you’re placing elements of an application stack, and maybe you’ve gotten to a level where you can deploy a new application in a mostly automated fashion. Many sites will stop there. They will proceed along automating redundant tasks and help to increase efficiency while reducing the error rate.
Indeed, this is the promise of automation, but in actuality your team is still thinking like operational personnel with a shiny new tool. “I used to do thing X in a segmented or serial fashion, but now I can let a tool do it the precise same way.”
This is not the promise of the automated infrastructure.
While you’ve increased efficiency and likely emboldened your team to do more with less, there’s one small component of this that breaks you free to crush your old methods, and that’s one of stepping into a paradigm shift of site and infrastructure management wherein you begin to think like a developer. Instead of seeing your environment as thousands of little components you are automating the configuration and operation of, now you have the power and the tools to see your entire infrastructure as major components of a larger operating whole.
Instead of thinking of your systems in components (ssh, httpd, named, etc.), you can think of collected components expressed as the system configuration. That system configuration is an abstraction of the whole that makes up that system’s identity in your site. When you see the site as abstractions that are collected together into higher-order abstractions, and those into even higher-order abstractions, your infrastructure becomes more a collection of abstractions than of individualistic configurations.
Raising your vision to one of modular components like building blocks allows you to model your site on a much higher level, and to change the way you consider configuration, deployment, security, and even operations.
Stop looking at the tools and the individual operations of the tool on an atomic level, and consider how you may build larger abstractions to divide your site into modular segments, then group those into larger building blocks that are grouped together again to model very specific outcomes.
Moving to a more holistic view from above, predicated on such an abstraction model, puts power into your hands to speed buildouts, deploy quickly, and take your provisioning and automation to higher levels with the same tools you’ve always used.
Sounds crazy to those who know, but the world out there is only just starting to get the DevOps momentum (much less the DevSecOps, SecOps, et al. momentum) even though it’s already a 10-year-old term!
The larger and more vertical sites are coming around to DevOps simply as a matter of inertia now, as the largest of the large have made a play, the developers the schools are turning out are ALL “DevOps Enabled”, and the market is dictating, nay demanding faster TTM and SDLC workflows.
To get your newest software to site faster, the entire lifecycle of systems and software throughout your pipeline has to be streamlined and interconnected. With provisioning and automation, continuous security with DevSecOps “shifting left” into the beginning of the SDLC process, automated software scanning (code quality AND security), and deployment scenarios that are as fully automated as possible, up to and including live to production, your business could be moving faster… MUCH faster.
Most sites have the talent and resources onsite to make a fully automated DevOps workflow a reality; they just need leadership and guidance to help shift their organization into a secure continuous delivery mechanism that follows the strong value “automate all the things!”
S & S can help you start this journey, implement all the components of your automation and SDLC strategy, help you integrate security into the process, and gain a high level of confidence through automated testing. Give us a call at 912-549-0272 and let us help you start your journey today!
We have a Puppet Enterprise Split Installation consisting of a Puppet Master, PuppetDB, and Puppet Enterprise Console.
We have a Load Balancer with two compiler nodes behind it.
We have an ActiveMQ Hub and an ActiveMQ Spoke and have removed ActiveMQ responsibilities from the Enterprise Master (MoM).
We have built a GitLab instance to host our Control Repo and other items necessary to operation of our Puppet environment.
The final remaining piece is like a “glue” step where we pull all the various pieces together and generate keys for SSH and deployment tokens. We also associate the catalog compilers to the Enterprise master and coordinate the deployment of code across the masters. Needless to say, you will need to have already performed all the preceding steps, and have made everything ready to go for the following procedure. Failure to have done so will have unpredictable results. So, if you’re ready, let’s proceed.
Setup A Control Repository
The first and foremost piece is to have a repo whose job is to “control” the processing of modules and custom code, and to give the “map” for deployment into your Enterprise master and catalog compilers. Puppet Labs has a suggested sample one here which has quite a number of nice features. However, when I first wrote this tutorial, it was considerable overkill for what I needed to do, so I opted to create my own very simple version of a control repository. Quite a few iterations have occurred since writing these instructions, so I will continue with my instructions.
The Control Repo
The control repo came about as a collaboration at Puppet Labs between employees, consultants, users, etc. It was originally named something else which escapes me at the moment, but eventually came to be named the “control repo” by virtue of the service it performs. In short, it contains the “map” between what you have in Git or at the Puppet Forge, and the deployment directories on your Puppet masters. The “map” itself is known as the “Puppetfile”. This file contains a listing of all the modules you want deployed to the server. The bonus is that for each Git branch you have within the repo, this specifies an “environment” to Puppet.
I won’t get into all the conversation around whether you should have 1:1 mapping between Git branches and Puppet Environments and then from Puppet Environments to application tiers… I’ll leave that to the Puppet folks. I have always mapped everything identically all the way through, and will cover that process here.
NOTE: The understood way of doing this these days is that if you have a new “thing” you want to create, you fork a feature branch, apply it to a few nodes for testing, then merge back into your master or production branch and deploy everywhere. I’d recommend learning this. In my own needs, I had a lot of governed environments (PCI, SOX, ITIL, etc) that needed absolute code separation, and the ability to demonstrate that no code in one environment had a chance of deploying to another. (principle of environment separation) As a result, I always opted for 1:1 correlation.
This is a bare repo with a collection of Forge modules populated into the Puppetfile with a “development” and a “production” branch. This will trigger Puppet to create directories called “development” and “production” in /etc/puppetlabs/code/environments, and will contain the items you instruct it to deploy there from the Puppetfile.
This will present you an issue in that you can’t push to my repo. What I always tell people to do is to move it to their own Git repo. This is well documented elsewhere, but I’ll give you an example process.
While in the control_repo directory, perform the following:
git checkout development
git remote rm origin
This now makes sure you’re in the development branch, and that the repo is unattached to my GitHub account. Next, you’ll need to create a control_repo in your Git server, and set it as your own remote. First login to your Git server and create an empty repo to hold the code. Next, in the repo you forked from mine, run the commands to switch repos like so:
git remote add origin https://<YOUR_GIT_SERVER>/<YOUR_ID>/control_repo.git
git add .
git commit -a -m 'Initial Commit'
git push origin development
git checkout production
git add .
git commit -a -m 'Initial Commit'
git push origin production
Now, you have the repo local to you and pointing to your own Git repository so you can edit and update the control_repo at will. (You might note extra steps there. These are for those unfamiliar with Git. They may not be necessary, but they give a good pattern for how to work with repos, and I want to establish good habits early.)
Generate SSH Keys
On the Enterprise Master, you will need to create locations and/or set permissions on files and directories used by Code Manager:
Next, you will need to generate a secret key to use with Code Manager setup. To create the secret key, perform the following on the Enterprise Master:
ssh-keygen -t rsa -b 4096 -C "SSH Deploy Keys"
When ssh-keygen asks you what to name the key, I usually give it a name I can remember, name it after the customer I am doing work for, or just name it for the Control Repo itself. In this case, let’s answer with the latter:
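A non-interactive sketch of the directory setup and key generation, using the key path that appears later in this article. The pe-puppet owner, the empty passphrase, and the function name are assumptions about a default PE install, so adjust them to your environment:

```shell
# Directory and key name follow the paths used later in this article.
KEY_DIR=/etc/puppetlabs/puppetserver/ssh
KEY_FILE="$KEY_DIR/id-control_repo.rsa"

# Create the directory, generate the key pair without a passphrase (assumed so
# that deployments can run unattended), and hand ownership to the assumed
# pe-puppet service account.
create_deploy_key() {
  mkdir -p "$KEY_DIR" &&
  ssh-keygen -t rsa -b 4096 -C "SSH Deploy Keys" -N "" -f "$KEY_FILE" &&
  chown -R pe-puppet:pe-puppet "$KEY_DIR"
}

# Usage (as root on the Enterprise Master):
#   create_deploy_key
```

The public half of the pair lands next to the private key with a .pub suffix.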
Next, you’ll need a Code Manager user in the Enterprise Console. Use the following process for that:
Create a new role named “Deploy Environments”
Assign this role the following permissions:
Add the “Puppet Environment” type.
Set Permissions for this type to “Deploy Code”.
Set the Object for this type to All.
Add the Tokens Type
Set Permissions for this type to Override Default Expiry.
Create a local user to manage code deployments.
Click “Access Control | Users”.
On the Users page, in the Full Name field, type the User’s Full Name: (e.g. CM Admin)
In the Login field, type the name cmadmin.
Click Add Local User.
Set the User’s password to “puppetlabs” (or whatever you’d like to use)
Select the user from the list.
Click “Generate password reset”.
Retrieve the link in a browser, and set the password to “puppetlabs”.
Finally, add the user to the “Deploy Environments” role.
Click the “Deploy Environments” role.
Click the “Member Users” tab.
From the dropdown list in the User Name field, select the CM Admin user and click Add User.
Code Manager
Under the covers, Puppet Labs now uses r10k with the control repository to manage the deployment of code. Under this scenario, a few items are very important to remember:
Under no circumstances should you be manually editing code in /etc/puppetlabs/code any more. Any attempt to do so will be overwritten by the code manager. ALL deployments to the system must come through your editing the control_repo and pointing to either Forge modules or custom modules you have written to be deployed to your Enterprise Master (and sync’ed to the catalog compilers).
You must have a control repo branch for each environment you wish to represent in your Masters (production, testing, etc.)
You cannot shorten or live without the fully named “production” environment. Puppet hard-coded this environment name in the product, and shortening the name to “prd”, “prod”, etc. will not work.
Code Manager operates with a synchronization subdirectory that lives in /etc/puppetlabs/code-staging. When you’re pushing code via your control_repo, it goes here first, then Code Manager and Code Sync take over, and publish the code to all compile masters at once. Once all masters have the code in code-staging, it gets copied to /etc/puppetlabs/code.
You should have a custom deployment user explicitly for pushing code into your master. I have settled on using “cmadmin” as a deploy user on the Git Server. This allows you to have a generic user on the GitLab server you created earlier that you can work with, configure web hooks for, and then leave the credentials for that user with your customer or place it into IDM for your company.
To setup the new user:
Create a user in the admin area of the GitLab server named “cmadmin”. Next, select “Edit” in the upper right hand corner of the screen and set the password as you see fit. (I’ll use “puppetlabs”.)
Select “Impersonate” from the upper right hand section of the page to assume the identity of the “cmadmin” user.
Select “New Project”.
On the resulting page, create a new repo called “control_repo” and make it a public project.
Click “Create Project”.
Push the control repo from the previous section to this repo in the cmadmin space.
Seeing as we are using GitLab, you are unable to use a full authenticated deploy token because GitLab server’s input buffer is too short to handle a full authentication token. NOTE: This has changed in later versions of GitLab. You may find success in just creating the token.
Configure the Webhook:
Connect to your Git server (e.g. http://git.example.com) and choose the “settings gear” from the bottom left hand side of the page.
Once in the settings for the cmadmin user, there is a small icon on the left frame that looks like two links of chain and is labelled “Webhooks”.
Next, add the https://master.example.com:8170/code-manager/v1/webhook?type=gitlab formatted webhook into the “URL” box.
The “prefix” section points to the name of the user based on the way GitLab uses namespaces in the URL.
Also select items you need from the list of options. I recommend selecting all items except “Build Events” and DE-select “Enable SSL Verification”.
Click “Add Webhook”.
Configuring the SSH Key
Finally, add the PUBLIC SSH key created on the Enterprise master located at /etc/puppetlabs/puppetserver/ssh/id-control_repo.rsa.pub to the SSH keys section for the CM Admin user in the GitLab Server.
While still impersonating the “cmadmin” user in the GitLab GUI Interface, choose the “cmadmin” icon in the lower left of the browser. Next, choose “Profile Settings” in the left hand bar.
Under the profile’s Settings, choose “SSH Keys” from the left hand bar.
Paste in the PUBLIC KEY to the “Key” text box. The Title text box should populate automatically. (or, you can name it yourself.)
Click “Add Key”.
Installing Code Manager
This process assumes you have followed this entire series from start to here in order. The final steps are to install and configure the Code Manager itself. This is that process.
At the Puppet Enterprise Console, navigate to Nodes | Classification | PE Master | Classes Tab | puppet_enterprise::profile::master.
In the puppet_enterprise::profile::master class, you need to set the following parameters:
r10k_remote => ‘the git FQDN and path to the namespace/control_repo of this node.’
e.g. git@git.example.com:cmadmin/control_repo.git
r10k_private_key => ‘the full path to your deploy key on your Puppet enterprise master’
e.g. /etc/puppetlabs/puppetserver/ssh/id-control_repo.rsa
file_sync_enabled => true
code_manager_auto_configure => true
At this point, you also want to make sure your control_repo has a hieradata value set. If you cloned your repo from mine, you already have that value set in the common.yaml in the hieradata directory. That setting would be:
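A sketch of what that setting likely looks like, assuming it is the Code Manager key that turns off webhook authentication (which would match the unauthenticated webhook configured earlier). The key name is an assumption, and the helper function is hypothetical, so verify both against your PE version before relying on them:

```shell
# Print the Hiera key that disables webhook authentication in Code Manager.
# The key is assumed from the unauthenticated webhook setup; it is not quoted from this article.
hiera_webhook_setting() {
  echo "puppet_enterprise::master::code_manager::authenticate_webhook: false"
}

# Usage, from the root of the control repo:
#   hiera_webhook_setting >> hieradata/common.yaml
```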
NOTE: Recall that GitLab has changed dramatically since the original writing of this tutorial. Later versions allow you to authenticate the webhook. When I wrote this, I was working around technological limitations that are now gone. Feel free to complete this as needed, but I just wanted to disclaim the reasoning for these previous configuration steps.
Next, ensure the hiera.yaml lives in $confdir as needed for Code manager:
Edit the /etc/puppetlabs/puppet/puppet.conf file to ensure there is a line in the “[main]” section: hiera_config = $confdir/hiera.yaml.
Finally, run the puppet agent to apply all the above configuration changes:
puppet agent -t
Test the hiera value on the command line to ensure Hiera has picked up your value:
NOTE: Later versions of Hiera respond to the “lookup” command. The older “hiera” command line utility has intermittent proper functioning at this time, and it has been recommended on the Puppet Community Slack that “Lookup” is the way to go at this time.
Generate Authentication Token
On the Puppet Enterprise Master, you must now generate an authentication token for the CM Admin deployment user to be authorized to push code. First, request the token:
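A minimal sketch of the token request, assuming the puppet-access client that ships with PE. The RBAC service URL (port 4433, /rbac-api), the 180-day lifetime, and the wrapper function are assumptions, so substitute the values for your site:

```shell
# Request an RBAC token for the deploy user; puppet-access prompts for credentials.
# $1 = FQDN of the Enterprise Master
request_deploy_token() {
  puppet-access login \
    --service-url "https://$1:4433/rbac-api" \
    --lifetime 180d
}

# Usage (on the Enterprise Master):
#   request_deploy_token master.example.com
```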
It will request a username and password. Use the credentials you created in the RBAC console (In my example, cmadmin::puppetlabs) and the system will write the token to /root/.puppetlabs/token.
Time to restart!!
Run the puppet agent on all compile masters in no particular order:
puppet agent -t
Now, let’s test!!
Prior to PE 2016.x.x, you could only fire the tests with curl commands against the API. Those would be as follows:
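A sketch of such a call against the Code Manager deploys endpoint, assuming the same port 8170 as the webhook URL above and the token written to /root/.puppetlabs/token. The function names are hypothetical:

```shell
# JSON body asking Code Manager to deploy one environment and wait for the result.
deploy_payload() {
  printf '{"environments": ["%s"], "wait": true}' "$1"
}

# $1 = master FQDN, $2 = environment name
# -k skips certificate verification, assumed acceptable for a first test.
deploy_environment() {
  curl -k -X POST \
    -H 'Content-Type: application/json' \
    -H "X-Authentication: $(cat /root/.puppetlabs/token)" \
    "https://$1:8170/code-manager/v1/deploys" \
    -d "$(deploy_payload "$2")"
}

# Usage (as root on the Enterprise Master):
#   deploy_environment master.example.com production
```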
At this point, you should see your code beginning to populate the /etc/puppetlabs/code-staging directory and then eventually the /etc/puppetlabs/code directory. Your final tests will include pushing code to the control_repo to test that the hook is working properly.
If all goes well, you should have code automatically deploy to the $codedir after a few seconds to a minute depending on a variety of factors.
Other Stuff
I wrote these as tutorials as I mentioned in the first article to help coworkers complete the same process I was doing. I had to sanitize out a lot of internal info, and I had to change hostnames on the fly to make sure “all the things” were secret that needed to be, so the names in question have not been specifically tested end-to-end, but the principles are the same.
I worked on both 2015.x.x and 2016.x.x with this process, but newer versions of PE may have different features or setup options not covered here. As with any “Open” documentation, “Your mileage may vary” and “Use at your own risk.”
I hope this helps someone out there get Code Manager set up and functioning in a Large Environment Installation scenario, and that you scale as large as you need to as a result of the footwork I’ve done here. Feel free to email me about errors you find, and I’ll fix ’em up right away!
If you’ve been following for the past 5 installments, we’re nearing the end! Note that each of the prior articles required other things to have been completed before reading/performing the contained steps, but this article is a bit different. In all truth, you could do this process at any point, but I placed it here for one reason alone. “Why do this manually when I could get Puppet to do it for me?”
The importance of this particular step is that we need a place to hold our “control repo” (more on this later) and if you don’t already have Git installed in your environment, you’ll need it. So, before finishing up the installation and configuration of Code Manager, utilizing Puppet to install GitLab is a good test that everything is installed and configured properly, and all the components are communicating as expected.
Without further delay, let’s continue.
Create a Machine to Serve as the GitLab Server
Provision a new node according to our earlier chart to serve as your GitLab server. While I list specifications, you may find more mileage by scaling the Git server larger. If you will be expanding your Puppet team and will have dozens to hundreds of people developing for Puppet, scaling will be a consideration. Also, while outside the scope of this article, you will want to configure offsite backup and/or replication to a geographically separate location for your GitLab server. This is of paramount importance. If you lose this server, all configuration for all systems managed in all environments across your organization would be lost. This isn’t the end of the world in terms of business continuity, but trying to recreate all that code from the ground up would be prohibitive.
Yes, people will have recent copies of the repo on their local machines. Yes, with some nonzero level of effort, you should be able to get the repos back. No, it’s not fun, and you’ll have a bad time. Just back up your server, and if possible…replicate it elsewhere in your organization.
My initial suggested specifications on this server are:
I don’t specify disk for /opt and /var here, as each of these images carries ample disk with it. If you believe you will need additional storage for your Git instance, feel free to scale this as you see fit.
Once the server is installed, go ahead and install the Puppet Agent on it, pointing to the compiler VIP like so:
Once the agent installation is complete, in the Puppet Enterprise Console, navigate to Nodes | Unsigned Certificates and accept the new cert request for the GitLab server. Once that is complete, SSH to the GitLab server, and run puppet agent -t to complete the initial configuration of the node.
Create a Profile to Manage the GitLab Installation
On the Puppet Enterprise Master, install the vshn-gitlab module.
puppet module install vshn-gitlab
NOTE: You will need to perform this on ALL catalog compilers in your infrastructure. If the GitLab server checks in and doesn’t find either the vshn-gitlab module or the profile you’re creating below on the master the load balancer refers it to, the catalog run will fail.
On the Puppet Enterprise Master (e.g., master.example.com), create a new profile in $codedir/environments/production/modules/profiles/manifests/gitlab.pp.
(Puppet Enterprise has an internal variable for $codedir now. If you have made no modifications to this in the puppet.conf, the default location is /etc/puppetlabs/code.)
The profile you create should look like the following:
# Configure GitLab Server
class profiles::gitlab {
  class { 'gitlab':
    external_url => 'http://git.example.com',
  }
}
Save this as gitlab.pp.
In the Puppet Enterprise Console, create a new classification group.
Navigate to Nodes | Classification
Create a group called ‘GitLab’ with a parent of ‘All Nodes’ in the Production Environment
Pin the git.example.com node into the newly created GitLab group.
Choose the ‘Classes’ tab and click the ‘Refresh’ icon to pick up your newly created profile.
Add the profiles::gitlab class to the classification group.
Commit the changes.
Caveats
Since we’re mid-setup and have multiple compilers but do not yet have code sync enabled, you have to manually copy the new profile to the same location on every compiler. This allows the agent on the GitLab server to pick up the profile regardless of where the load balancer sends the agent request.
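If you have more than a couple of compilers, a little loop saves some typing. The following is only a sketch under a few assumptions: the compiler names are the example ones used in this series, you can SSH to each of them as root from the master, and run and copy_profile are made-up helper names. It defaults to a dry run that just prints the commands, so nothing is touched until you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: push the new profile from the master to each compiler.
# Assumptions: default $codedir, root SSH access, example host names.
CODEDIR="${CODEDIR:-/etc/puppetlabs/code}"
PROFILE="${CODEDIR}/environments/production/modules/profiles/manifests/gitlab.pp"
COMPILERS="${COMPILERS:-compile1.example.com compile2.example.com}"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

copy_profile() {
  for host in $COMPILERS; do
    # Make sure the target directory exists, then copy the manifest.
    run ssh "root@${host}" mkdir -p "$(dirname "$PROFILE")" || return 1
    run scp -p "$PROFILE" "root@${host}:${PROFILE}" || return 1
  done
}
```

Call copy_profile once to review the printed commands, then run it again with DRY_RUN=0 to perform the copy. Once code sync is enabled later in the series, this manual step goes away.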
Once the profile is in place, run puppet agent -t on your GitLab server, and Puppet will then install the GitLab software onto the server. At this point, after a short delay, you should be able to retrieve your GitLab server in a browser (e.g. http://git.example.com) and log in with the default credentials.
In our example, git.example.com is the server and the login would be automatically set to admin@example.com with a password of 5iveL!fe. These are defaults set by the GitLab installer.
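How long that short delay lasts depends on your hardware, so rather than mashing refresh in the browser, you could poll the URL from a shell. This is a minimal sketch, assuming curl is installed and that the external_url from the profile above is reachable from wherever you run it. The wait_for_gitlab name and the SLEEP_SECS variable are hypothetical.

```shell
#!/bin/sh
# Sketch: poll the GitLab URL until it answers or we run out of attempts.
# Usage: wait_for_gitlab [url] [attempts]
wait_for_gitlab() {
  url="${1:-http://git.example.com}"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -s: quiet, -f: treat HTTP errors as failure, -L: follow redirects
    if curl -sfL -o /dev/null "$url"; then
      echo "GitLab is answering at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "${SLEEP_SECS:-10}"
  done
  echo "GitLab did not answer at $url after $tries attempts" >&2
  return 1
}
```

With the defaults, this gives GitLab about five minutes to come up before giving up.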
Your GitLab server should now be up, running, and ready for action in your Puppet Environment. Look for the final installment to bring everything together and finish the installation.
As in the previous installment, you need to have already completed a few steps before arriving at this post. You should have already completed a “split installation” (documented here). Also, your load balancer needs to be configured and running. The procedure for this portion can be found here. Finally, you should have the additional compilers installed and configured along with two example agent nodes as covered here and here. If you’ve completed all these portions, you are now ready to configure ActiveMQ for scaling MCollective.
Once the preceding items are performed, you may find it necessary to add ActiveMQ hubs and spokes to increase capacity for MCollective and/or the Code Sync and Code Manager functions of Puppet Enterprise. This installment documents how to install these additional components and tie them into the existing infrastructure.
Create an ActiveMQ Hub
Go to the Puppet Enterprise Console in your browser.
Select Nodes | Classification and create a new group called “PE ActiveMQ Hub”
Stand up two new nodes for the ActiveMQ Hub and Spoke (in our example, activemq-hub.example.com and activemq-spoke.example.com) according to the following specifications:
Once your nodes have been provisioned, install the Puppet Agent on each node, making sure to point the installer DIRECTLY at the MoM (master.example.com) instead of at the compiler VIP, and let the agent install complete in its entirety.
Next, from your browser, retrieve the Puppet Enterprise Console and select the “PE ActiveMQ Hub” group you created earlier. Pin the activemq-hub.example.com node into the PE ActiveMQ Hub group.
Select the “Classes” tab and add a new class entitled: “puppet_enterprise::profile::amq::hub”
Click “Add Class”.
Under the Parameters drop-down, select “network_connector_spoke_collect_tag” and set its value to “pe-amq-network-connectors-for-activemq-hub.example.com”
Commit the changes.
SSH to the activemq-hub.example.com and run puppet agent -t to make all your changes effective for the Hub node.
Create ActiveMQ Spoke (or “broker”)
In the Puppet Enterprise Console, select Nodes | Classification | PE ActiveMQ Broker
Pin your new ActiveMQ broker into the PE ActiveMQ Broker group.
Select the “Classes” tab.
Under the puppet_enterprise::profile::amq::broker class, choose the activemq_hubname parameter and set it to the FQDN of the hub you just created. In our case, that is activemq-hub.example.com.
SSH to the new broker (activemq-spoke.example.com) and run puppet agent -t.
Finally, unpin master.example.com from the PE ActiveMQ Broker group.
Conclusion
At this point, you should have:
Puppet Master of Masters – master.example.com
PuppetDB – puppetdb.example.com
PE Console – console.example.com
HAProxy Node – compiler.example.com
2 Catalog compilers – compile1.example.com and compile2.example.com
An ActiveMQ Hub – activemq-hub.example.com
An ActiveMQ Spoke – activemq-spoke.example.com
Two Agent Nodes – agent1.example.com and agent2.example.com
with their respective configurations. Your serving infrastructure is complete, and you are now ready to configure it for use.