ESXi Node Ready

Overview

This guide provides a high-level overview of what is possible when using DRP to create ESXi node-ready baremetal systems. It is intended for architects and others who want to understand how DRP can help discover, configure, provision, and stage baremetal systems, including the requisite setup for firmware, BIOS, storage, network, and OS configuration. Detailed howtos, reference documentation, and other pertinent documentation will be included to help disseminate this information to other roles.

Prerequisites

This is a general list for this document. Please refer to specific howtos and reference documentation for any other prerequisites that may not be discussed in this section.

This document assumes a physical system that matches typical production ESXi systems, and a DRP instance that is configured with universal and can communicate with the system and its BMC. It is highly recommended that the system meet the VMware Compatibility Guide.

Note

While you can use a virtual machine for a demo or a quick PoC, this approach does not fully show the depth and breadth of the capabilities DRP provides. We do not recommend it for anything other than a quick look at certain parts of the process.

At a minimum the vmware content pack should be installed.

You will need to have an ESXi ISO available from VMware. We install two VIBs as part of the legacy install path. You can create a custom ISO that includes the VIBs. (TODO: Make sure we discuss SOMEWHERE where the VIBs are located)

It is important to be familiar with the architecture of the universal workflow system.

Pipeline

The benefit of using DRP is that you can drive a system to node-ready declaratively. You describe the desired end state with parameters, and a pipeline drives the system until those parameters are met.

This section will walk through a typical ESXi node-ready pipeline.

TODO: Diagram of this

universal-discover

The system is booted into sledgehammer, where a basic system inventory is performed to help identify whether the system has already been discovered. Further discovery is performed on the system's RAID, BMC, BIOS, and flash, and the attached network ports can optionally be discovered using LLDP. There are opportunities to inject tasks and/or external service calls before, during, and after discovery using flexiflow. After the inventory, there is an opportunity to classify the system based on what has been discovered so far. Validation is also possible after classification. The pipeline then chain-maps to the universal-hardware workflow.

universal-hardware

The system is run through BMC, flash, RAID, and BIOS discovery. RAID is configured. The current BIOS configuration is recorded and custom configuration is applied. Classification of the system occurs if desired. Tasks and external service calls can be injected before, during, and after each discovery. The pipeline then chain-maps to the desired ESXi installation method based on machine profiles/parameters.
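As a rough mental model, chain-mapping can be pictured as a lookup table in which each workflow names the workflow that follows it. The sketch below uses the workflow names from this guide, but the dictionary layout and the `walk_pipeline` helper are illustrative assumptions. They are not DRP's actual chain-map parameter format or API.

```python
# Simplified model of chain-mapping: each workflow points at its successor.
# Assumption: this dict is illustrative only and does not mirror the real
# DRP chain-map parameter structure.
CHAIN_MAP = {
    "universal-discover": "universal-hardware",
    "universal-hardware": "universal-esxi-kickstart",
    "universal-esxi-kickstart": "universal-esxi-config",
    "universal-esxi-config": "universal-runbook",
}


def walk_pipeline(start, chain_map):
    """Return the ordered list of workflows reached from `start`."""
    order = [start]
    seen = {start}
    while order[-1] in chain_map:
        nxt = chain_map[order[-1]]
        if nxt in seen:
            # Guard against a map that loops back on itself.
            raise ValueError(f"chain map loops back to {nxt!r}")
        order.append(nxt)
        seen.add(nxt)
    return order


if __name__ == "__main__":
    print(" -> ".join(walk_pipeline("universal-discover", CHAIN_MAP)))
```

In this model, choosing a different ESXi installation method amounts to changing the value mapped from universal-hardware, while the rest of the chain stays the same.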

universal-esxi-kickstart

The kickstart installation method boots the ESXi ISO and installs using the standard kickstart method. By default, a custom ISO must be created that includes the two VIBs that provide the firewall rules and the DRP agent.

Note

Setting esxi/legacy-install to true will inject VIB installs for firewall rules and DRP agent during first-boot, eliminating the need for a custom ISO with the VIBs included.
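A minimal sketch of how such a parameter might be grouped into a profile-shaped payload. The parameter name `esxi/legacy-install` comes from the note above. The profile name, the description text, and the `build_profile` helper are hypothetical, and how the payload is loaded into DRP (CLI, API, or UI) is deliberately left out.

```python
import json


def build_profile(name, legacy_install=True):
    """Build a profile-shaped dict that sets esxi/legacy-install.

    Assumption: only the parameter name comes from this guide; the
    description text is a placeholder.
    """
    return {
        "Name": name,
        "Description": "Install firewall and agent VIBs at first boot",
        "Params": {"esxi/legacy-install": bool(legacy_install)},
    }


if __name__ == "__main__":
    # "esxi-legacy-install-example" is a hypothetical profile name.
    print(json.dumps(build_profile("esxi-legacy-install-example"), indent=2))
```

Keeping the setting in a profile rather than on a single machine would let the same choice be applied to every machine that should skip the custom ISO.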

Prior to booting into the ESXi ISO environment, some prep work occurs. Drives are wiped, passwords are set, and the appropriate bootenv is selected. Once the system has booted into the correct bootenv, the kickstart install occurs. After installation, the system is booted again. The appropriate VIB acceptance level is set. If a patchlist is applied, ESXi is placed in maintenance mode long enough to apply the patches. Another opportunity for classification is available before chain-mapping to universal-esxi-config.
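The patching behavior described above can be sketched as a small decision: maintenance mode is only entered when there is something to apply. The task names below are taken from the esxi-install-patches stage of the workflow diagram. The `plan_patch_tasks` helper and the rule that an empty patch list does no work are assumptions made for illustration, not a description of how the DRP tasks decide internally.

```python
# Task names come from the esxi-install-patches stage in the workflow diagram.
PATCH_TASKS = [
    "esxi-enable-maint-mode",
    "esxi-patch-install",
    "esxi-exit-maint-mode",
]


def plan_patch_tasks(patch_list):
    """Return the tasks that do real work for the given patch list.

    Assumption: an empty or missing patch list means there is nothing to
    apply, so the host never needs to enter maintenance mode.
    """
    if not patch_list:
        return []
    return list(PATCH_TASKS)


if __name__ == "__main__":
    # "example-patch" is a placeholder, not a real patch name.
    print(plan_patch_tasks(["example-patch"]))
    print(plan_patch_tasks([]))
```

The ordering matters in this sketch: maintenance mode wraps the patch install, so the host is returned to normal operation as soon as patching finishes.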

Workflow Diagram

flowchart TD
    subgraph pipeline: esxi-kickstart
        w1[workflow: esxi-kickstart] --> w2[workflow: universal-discover]
        w2[workflow: universal-discover] --> w3[workflow: universal-hardware]
        w3[workflow: universal-hardware] --> w4[workflow: universal-esxi-kickstart]
        w4[workflow: universal-esxi-kickstart] --> w5[workflow: universal-esxi-config]
        w5[workflow: universal-esxi-config] --> w6[workflow: universal-runbook]

        subgraph w4[workflow: universal-esxi-kickstart]
          direction TB
          s1 --> s2
          s2 --> s3
          s3 --> s4
          s4 --> s5
          s5 --> s6
          s6 --> s7
          s7 --> s8
          s8 --> s9
          s9 --> s10
          s10 --> s11
          s11 --> s12
          s12 --> s13
          s13 --> s14
          s14 --> s15
          s15 --> s16
          s16 --> s17
          s17 --> s18
          s18 --> s19
          s19 --> s20
          s20 --> s21
          s21[stage: complete]

          subgraph s1[stage: discover]
            direction TB
            s1t1[task: update-pipeline] --> s1t2
            s1t2[task: enforce-sledgehammer] --> s1t3
            s1t3[task: sledgehammer-set-working-python] --> s1t4
            s1t4[task: set-machine-ip-in-sledgehammer] --> s1t5
            s1t5[task: reserve-dhcp-address] --> s1t6
            s1t6[task: ssh-access] --> s1t7
            s1t7[task: record-current-uefi-boot-entry]
          end

          subgraph s2[stage: universal-esxi-kickstart-start-callback]
            s2t1[task: callback-task]
          end

          subgraph s3[stage: universal-esxi-kickstart-pre-flexiflow]
            s3t1[task: flexiflow-start] --> s3t2
            s3t2[task: flexiflow-stop]

          end

          subgraph s4[stage: prep-install]
            s4t1[task: erase-hard-disks-for-os-install]
          end

          subgraph s5[stage: vmware-esxi-clear-patch-index]
            s5t1[task: vmware-esxi-clear-patch-index]
          end

          subgraph s6[stage: vmware-esxi-set-password]
            s6t1[task: vmware-esxi-set-password]
          end

          subgraph s7[stage: vmware-esxi-selector]
            s7t1[task: vmware-esxi-selector]
          end

          subgraph s8[stage: esxi-preserve-logs]
            s8t1[task: esxi-preserve-logs]
          end

          subgraph s9[stage: universal-esxi-kickstart-during-install-flexiflow]
            s9t1[task: flexiflow-start] --> s9t2
            s9t2[task: flexiflow-stop]
          end

          subgraph s10[stage: finish-install]
            s10b1[bootenv: local]
          end

          subgraph s11[stage: esxi-acceptance-level]
            s11t1[task: esxi-acceptance-level]
          end

          subgraph s12[stage: esxi-reorder-uefi-bootorder]
            s12t1[task: reorder-uefi-boot-order]
          end

          subgraph s13[stage: esxi-rename-datastore]
            s13t1[task: esxi-rename-datastore]
          end

          subgraph s14[stage: esxi-preserve-logs]
            s14t1[task: esxi-preserve-logs]
          end

          subgraph s15[stage: esxi-install-patches]
            s15t1[task: esxi-enable-maint-mode] --> s15t2
            s15t2[task: esxi-patch-install] --> s15t3
            s15t3[task: esxi-exit-maint-mode]
          end

          subgraph s16[stage: universal-esxi-kickstart-post-flexiflow]
            s16t1[task: flexiflow-start] --> s16t2
            s16t2[task: flexiflow-stop]

          end

          subgraph s17[stage: universal-esxi-kickstart-classification]
            s17t1[task: classify-stage-list-start] --> s17t2
            s17t2[task: classify-stage-list-stop]
          end

          subgraph s18[stage: universal-esxi-kickstart-post-validation]
            s18t1[task: validation-start] --> s18t2
            s18t2[task: validation-stop]
          end

          subgraph s19[stage: universal-esxi-kickstart-complete-callback]
            s19t1[task: callback-task]
          end

          subgraph s20[stage: universal-chain-workflow]
            s20t1[task: universal-chain-workflow]
          end
        end
    end

universal-esxi-config

The following represents a consolidated list of "important" stages and tasks that occur during universal-esxi-config. This is typically chain-mapped from universal-esxi-kickstart or universal-esxi-image workflows.

flowchart TD
    subgraph pipeline: esxi-kickstart
        w1[workflow: esxi-kickstart] --> w2[workflow: universal-discover]
        w2[workflow: universal-discover] --> w3[workflow: universal-hardware]
        w3[workflow: universal-hardware] --> w4[workflow: universal-esxi-kickstart]
        w4[workflow: universal-esxi-kickstart] --> w5[workflow: universal-esxi-config]
        w5[workflow: universal-esxi-config] --> w6[workflow: universal-runbook]

        subgraph w5[workflow: universal-esxi-config]
          direction TB
          s1 --> s2
          s2 --> s3
          s3 --> s4
          s4 --> s5
          s5 --> s6
          s6 --> s7
          s7 --> s8
          s8 --> s9
          s9 --> s10
          s10 --> s11
          s11 --> s12
          s12 --> s13
          s13 --> s14
          s14 --> s15
          s15 --> s16
          s16 --> s17
          s17 --> s18
          s18[stage: complete]

          subgraph s1[stage: universal-esxi-config-start-callback]
            s1t1[task: callback-task]
          end

          subgraph s2[stage: universal-esxi-config-pre-flexiflow]
            s2t1[task: flexiflow-start] --> s2t2
            s2t2[task: flexiflow-stop]
          end

          subgraph s3[stage: esxi-acceptance-level]
            s3t1[task: esxi-acceptance-level]

          end

          subgraph s4[stage: esxi-rename-datastore]
            s4t1[task: esxi-rename-datastore]
          end

          subgraph s5[stage: esxi-activate-network]
            direction TB
            s5t1[task: esxi-set-hostname] --> s5t2
            s5t2[task: esxi-set-network] --> s5t3
            s5t3[task: esxi-set-network-protocol] --> s5t4
            s5t4[task: esxi-set-dns] --> s5t5
            s5t5[task: esxi-set-ntp]
          end

          subgraph s6[stage: esxi-activate-shells]
            s6t1[task: esxi-activate-shells]
          end

          subgraph s7[stage: esxi-activate-nested]
            s7t1[task: esxi-activate-nesting]
          end

          subgraph s8[stage: esxi-activate-password-policy]
            s8t1[task: esxi-password-security-policy]
          end

          subgraph s9[stage: esxi-manage-users]
            s9t1[task: esxi-manage-users]
          end

          subgraph s10[stage: esxi-install-welcome]
            s10t1[task: esxi-install-welcome]
          end

          subgraph s11[stage: esxi-install-certificate]
            s11t1[task: esxi-install-certificate]
          end

          subgraph s12[stage: esxi-preserve-logs]
            s12t1[task: esxi-preserve-logs]
          end

          subgraph s13[stage: universal-esxi-config-post-flexiflow]
            s13t1[task: flexiflow-start] --> s13t2
            s13t2[task: flexiflow-stop]
          end

          subgraph s14[stage: universal-esxi-config-classification]
            s14t1[task: classify-stage-list-start] --> s14t2
            s14t2[task: classify-stage-list-stop]
          end

          subgraph s15[stage: universal-esxi-config-post-validation]
            s15t1[task: validation-start] --> s15t2
            s15t2[task: validation-stop]
          end

          subgraph s16[stage: universal-esxi-config-complete-callback]
            s16t1[task: callback-task]

          end

          subgraph s17[stage: universal-chain-workflow]
            s17t2[task: universal-chain-workflow]
          end
        end
    end

After installation and first boot, the VIB acceptance level can be adjusted, and the default datastore name is configured if it has not been already. The hostname, network, remote shells, nesting, password policy, users, welcome screen, and SSL certificates are then configured. As in the other workflows, classification, task injection, and external service calls are available throughout the workflow before it chain-maps to universal-runbook.

universal-runbook

This workflow merges universal-discover and universal-start. It provides a post-install/start path to finalize any other tasks. As with other workflows, there are opportunities for classification, task injection and external service calls.