Description |
Implementation of cluster-wide logic.
This module holds all pure cluster-logic; I/O related functionality
goes into the Main module for the individual binaries.
|
|
Synopsis |
|
data AllocSolution = AllocSolution {}
data EvacSolution = EvacSolution {}
type AllocResult = (FailStats, List, List, [Instance], [CStats])
type AllocNodes = Either [Ndx] [(Ndx, [Ndx])]
emptyAllocSolution :: AllocSolution
emptyEvacSolution :: EvacSolution
data Table = Table List List Score [Placement]
data CStats = CStats {}
type AllocMethod = List -> List -> Maybe Int -> Instance -> AllocNodes -> [Instance] -> [CStats] -> Result AllocResult
type EvacInnerState = Either String (List, Instance, Score, Ndx)
verifyN1 :: [Node] -> [Node]
computeBadItems :: List -> List -> ([Node], [Instance])
instanceNodes :: List -> Instance -> (Ndx, Ndx, Node, Node)
emptyCStats :: CStats
updateCStats :: CStats -> Node -> CStats
totalResources :: List -> CStats
computeAllocationDelta :: CStats -> CStats -> AllocStats
detailedCVInfo :: [(Double, String)]
detailedCVWeights :: [Double]
compDetailedCV :: [Node] -> [Double]
compCVNodes :: [Node] -> Double
compCV :: List -> Double
getOnline :: List -> [Node]
compareTables :: Table -> Table -> Table
applyMove :: List -> Instance -> IMove -> OpResult (List, Instance, Ndx, Ndx)
allocateOnSingle :: List -> Instance -> Ndx -> OpResult AllocElement
allocateOnPair :: List -> Instance -> Ndx -> Ndx -> OpResult AllocElement
checkSingleStep :: Table -> Instance -> Table -> IMove -> Table
possibleMoves :: MirrorType -> Bool -> Bool -> Ndx -> [IMove]
checkInstanceMove :: [Ndx] -> Bool -> Bool -> Table -> Instance -> Table
checkMove :: [Ndx] -> Bool -> Bool -> Table -> [Instance] -> Table
doNextBalance :: Table -> Int -> Score -> Bool
tryBalance :: Table -> Bool -> Bool -> Bool -> Score -> Score -> Maybe Table
collapseFailures :: [FailMode] -> FailStats
bestAllocElement :: Maybe AllocElement -> Maybe AllocElement -> Maybe AllocElement
concatAllocs :: AllocSolution -> OpResult AllocElement -> AllocSolution
sumAllocs :: AllocSolution -> AllocSolution -> AllocSolution
describeSolution :: AllocSolution -> String
annotateSolution :: AllocSolution -> AllocSolution
reverseEvacSolution :: EvacSolution -> EvacSolution
genAllocNodes :: List -> List -> Int -> Bool -> Result AllocNodes
tryAlloc :: Monad m => List -> List -> Instance -> AllocNodes -> m AllocSolution
solutionDescription :: List -> (Gdx, Result AllocSolution) -> [String]
filterMGResults :: List -> [(Gdx, Result AllocSolution)] -> [(Gdx, AllocSolution)]
sortMGResults :: List -> [(Gdx, AllocSolution)] -> [(Gdx, AllocSolution)]
findBestAllocGroup :: List -> List -> List -> Maybe [Gdx] -> Instance -> Int -> Result (Gdx, AllocSolution, [String])
tryMGAlloc :: List -> List -> List -> Instance -> Int -> Result AllocSolution
failOnSecondaryChange :: Monad m => EvacMode -> DiskTemplate -> m ()
nodeEvacInstance :: List -> List -> EvacMode -> Instance -> Gdx -> [Ndx] -> Result (List, List, [OpCode])
evacOneNodeOnly :: List -> List -> Instance -> Gdx -> [Ndx] -> Result (List, List, [OpCode])
evacOneNodeInner :: List -> Instance -> Gdx -> (Ndx -> IMove) -> EvacInnerState -> Ndx -> EvacInnerState
evacDrbdAllInner :: List -> List -> Instance -> Gdx -> (Ndx, Ndx) -> Result (List, List, [OpCode], Score)
availableGroupNodes :: [(Gdx, [Ndx])] -> IntSet -> Gdx -> Result [Ndx]
updateEvacSolution :: (List, List, EvacSolution) -> Idx -> Result (List, List, [OpCode]) -> (List, List, EvacSolution)
tryNodeEvac :: List -> List -> List -> EvacMode -> [Idx] -> Result (List, List, EvacSolution)
tryChangeGroup :: List -> List -> List -> [Gdx] -> [Idx] -> Result (List, List, EvacSolution)
iterateAlloc :: AllocMethod
tieredAlloc :: AllocMethod
computeMoves :: Instance -> String -> IMove -> String -> String -> (String, [String])
printSolutionLine :: List -> List -> Int -> Int -> Placement -> Int -> (String, [String])
involvedNodes :: List -> Placement -> [Ndx]
mergeJobs :: ([JobSet], [Ndx]) -> MoveJob -> ([JobSet], [Ndx])
splitJobs :: [MoveJob] -> [JobSet]
formatJob :: Int -> Int -> (Int, MoveJob) -> [String]
formatCmds :: [JobSet] -> String
printNodes :: List -> [String] -> String
printInsts :: List -> List -> String
printStats :: String -> List -> String
iMoveToJob :: List -> List -> Idx -> IMove -> [OpCode]
instanceGroup :: List -> Instance -> Result Gdx
instancePriGroup :: List -> Instance -> Gdx
findSplitInstances :: List -> List -> [Instance]
splitCluster :: List -> List -> [(Gdx, (List, List))]
nodesToEvacuate :: List -> EvacMode -> [Idx] -> IntSet
|
|
|
Types
|
|
|
Allocation/relocation solution.
| Constructors | AllocSolution | | asFailures :: [FailMode] | Failure counts
| asAllocs :: Int | Good allocation count
| asSolution :: Maybe AllocElement | The actual allocation result
| asLog :: [String] | Informational messages
|
|
|
|
|
|
Node evacuation/group change iallocator result type. This result
type consists of actual opcodes (a restricted subset) that are
transmitted back to Ganeti.
| Constructors | EvacSolution | | esMoved :: [(Idx, Gdx, [Ndx])] | Instances moved successfully
| esFailed :: [(Idx, String)] | Instances which were not
relocated
| esOpCodes :: [[OpCode]] | List of jobs
|
|
|
|
|
|
Allocation results, as used in iterateAlloc and tieredAlloc.
|
|
|
A type denoting the valid allocation mode/pairs.
For a one-node allocation, this will be a Left [Ndx], whereas
for a two-node allocation, this will be a Right [(Ndx,
[Ndx])]. In the latter case, the list is basically an
association list, grouped by primary node and holding the potential
secondary nodes in the sub-list.
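As an illustration of the two shapes, the following minimal sketch expands an AllocNodes value into explicit candidate placements. The Ndx alias and the helper name expandCandidates are assumptions made for this example and are not part of the module.

```haskell
-- Stand-in for the node index type (an assumption of this sketch).
type Ndx = Int

-- Same shape as the documented type.
type AllocNodes = Either [Ndx] [(Ndx, [Ndx])]

-- | Hypothetical helper: list every candidate placement explicitly.
-- Single-node mode yields one-element lists; pair mode yields
-- [primary, secondary] for each secondary in the primary's sub-list.
expandCandidates :: AllocNodes -> [[Ndx]]
expandCandidates (Left ns)     = [[n] | n <- ns]
expandCandidates (Right pairs) = [[p, s] | (p, ss) <- pairs, s <- ss]
```

Grouping the secondaries under each primary keeps the pair mode compact: a primary with k valid secondaries costs one entry rather than k tuples.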
|
|
|
The empty solution we start with when computing allocations.
|
|
|
The empty evac solution.
|
|
|
The complete state for the balancing solution.
| Constructors | Table List List Score [Placement] |
|
|
|
Cluster statistics data type.
| Constructors | CStats | | csFmem :: Integer | Cluster free mem
| csFdsk :: Integer | Cluster free disk
| csAmem :: Integer | Cluster allocatable mem
| csAdsk :: Integer | Cluster allocatable disk
| csAcpu :: Integer | Cluster allocatable cpus
| csMmem :: Integer | Max node allocatable mem
| csMdsk :: Integer | Max node allocatable disk
| csMcpu :: Integer | Max node allocatable cpu
| csImem :: Integer | Instance used mem
| csIdsk :: Integer | Instance used disk
| csIcpu :: Integer | Instance used cpu
| csTmem :: Double | Cluster total mem
| csTdsk :: Double | Cluster total disk
| csTcpu :: Double | Cluster total cpus
| csVcpu :: Integer | Cluster total virtual cpus
| csNcpu :: Double | Equivalent to csIcpu but in terms of
physical CPUs, i.e. normalised used phys CPUs
| csXmem :: Integer | Unaccounted for mem
| csNmem :: Integer | Node own memory
| csScore :: Score | The cluster score
| csNinst :: Int | The total number of instances
|
|
|
|
|
|
= List | Node list
| -> List | Instance list
| -> Maybe Int | Optional allocation limit
| -> Instance | Instance spec for allocation
| -> AllocNodes | Which nodes we should allocate on
| -> [Instance] | Allocated instances
| -> [CStats] | Running cluster stats
| -> Result AllocResult | Allocation result
| A simple type for allocation functions.
|
|
|
|
A simple type for the running solution of evacuations.
|
|
Utility functions
|
|
|
Verifies the N+1 status and returns the affected nodes.
|
|
|
Computes the pair of bad nodes and instances.
The bad node list is computed via a simple verifyN1 check, and the
bad instance list is the list of primary and secondary instances of
those nodes.
|
|
|
Extracts the node pairs for an instance. This can fail if the
instance is single-homed. FIXME: this needs to be improved,
together with the general enhancement for handling non-DRBD moves.
|
|
|
Zero-initializer for the CStats type.
|
|
|
Update stats with data from a new node.
|
|
|
Compute the total free disk and memory in the cluster.
|
|
|
Compute the delta between two cluster states.
This is used when doing allocations, to understand better the
available cluster resources. The return value is a triple of the
current used values, the delta that was still allocated, and what
was left unallocated.
|
|
detailedCVInfo :: [(Double, String)]
|
The names and weights of the individual elements in the CV list.
|
|
detailedCVWeights :: [Double]
|
Holds the weights used by compCVNodes for each metric.
|
|
|
Compute the mem and disk covariance.
|
|
|
Compute the total variance.
|
|
|
Wrapper over compCVNodes for callers that have a List.
|
|
|
Compute online nodes from a List.
|
|
Balancing functions
|
|
|
Compute best table. Note that the ordering of the arguments is important.
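One way argument order can matter is tie-breaking. The sketch below uses a cut-down, hypothetical Table (label and score only) and assumes that a lower score is better and that the first argument wins ties; neither assumption is confirmed by this page.

```haskell
-- Hypothetical, cut-down table: only a label and a score.
data Table = Table { tLabel :: String, tScore :: Double }
  deriving (Show, Eq)

-- | Keep the first table unless the second has a strictly lower score.
-- With equal scores the first argument is returned, so swapping the
-- arguments can change the result.
compareTables :: Table -> Table -> Table
compareTables a b = if tScore a > tScore b then b else a
```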
|
|
|
Applies an instance move to a given node list and instance.
|
|
|
Tries to allocate an instance on one given node.
|
|
|
Tries to allocate an instance on a given pair of nodes.
|
|
|
:: Table | The original table
| -> Instance | The instance to move
| -> Table | The current best table
| -> IMove | The move to apply
| -> Table | The final best table
| Tries to perform an instance move and returns the best table
between the original one and the new one.
|
|
|
|
:: MirrorType | The mirroring type of the instance
| -> Bool | Whether the secondary node is a valid new node
| -> Bool | Whether we can change the primary node
| -> Ndx | Target node candidate
| -> [IMove] | List of valid result moves
| Given the status of the current secondary as a valid new node and
the current candidate target node, generate the possible moves for
an instance.
|
|
|
|
:: [Ndx] | Allowed target node indices
| -> Bool | Whether disk moves are allowed
| -> Bool | Whether instance moves are allowed
| -> Table | Original table
| -> Instance | Instance to move
| -> Table | Best new table for this instance
| Compute the best move for a given instance.
|
|
|
|
:: [Ndx] | Allowed target node indices
| -> Bool | Whether disk moves are allowed
| -> Bool | Whether instance moves are allowed
| -> Table | The current solution
| -> [Instance] | List of instances still to move
| -> Table | The new solution
| Compute the best next move.
|
|
|
|
:: Table | The starting table
| -> Int | Remaining length
| -> Score | Score at which to stop
| -> Bool | Whether balancing may continue
| Check if we are allowed to go deeper in the balancing.
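A minimal sketch of such a stop test, under stated assumptions: the function name mayContinue, the explicit count of moves made so far, and the convention that a negative budget means "no limit" are all hypothetical and chosen only for illustration.

```haskell
type Score = Double

-- | Hypothetical stop test: continue while the round budget is not
-- exhausted and the current score is still above the target score.
-- A negative budget is treated as "no limit" (an assumption).
mayContinue :: Int    -- ^ Moves made so far
            -> Int    -- ^ Maximum number of rounds
            -> Score  -- ^ Current score
            -> Score  -- ^ Score at which to stop
            -> Bool
mayContinue movesSoFar maxRounds current target =
  (maxRounds < 0 || movesSoFar < maxRounds) && current > target
```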
|
|
|
|
:: Table | The starting table
| -> Bool | Allow disk moves
| -> Bool | Allow instance moves
| -> Bool | Only evacuate moves
| -> Score | Min gain threshold
| -> Score | Min gain
| -> Maybe Table | The resulting table and commands
| Run a balance move.
|
|
|
Allocation functions
|
|
|
Build failure stats out of a list of failures.
|
|
|
Compares two Maybe AllocElement and chooses the best score.
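The comparison can be sketched as follows. The AllocElement record here is a hypothetical simplification (a score plus node indices), and the sketch assumes that a lower score is better.

```haskell
-- Hypothetical element: a score plus the chosen node indices.
data AllocElement = AllocElement { aeScore :: Double, aeNodes :: [Int] }
  deriving (Show, Eq)

-- | Pick the better of two optional results; a missing result always
-- loses against a present one.
bestOf :: Maybe AllocElement -> Maybe AllocElement -> Maybe AllocElement
bestOf a Nothing = a
bestOf Nothing b = b
bestOf a@(Just x) b@(Just y) = if aeScore x < aeScore y then a else b
```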
|
|
|
Update current Allocation solution and failure stats with new
elements.
|
|
|
Sums two AllocSolution structures.
|
|
|
Given a solution, generates a reasonable description for it.
|
|
|
Annotates a solution with the appropriate string.
|
|
|
Reverses an evacuation solution.
Rationale: we always concat the results to the top of the lists, so
for proper jobset execution, we should reverse all lists.
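The accumulate-then-reverse pattern can be sketched like this. The record mirrors the three documented fields, but opcodes are plain strings and the helper addMoved is hypothetical.

```haskell
-- Cut-down record with the three documented fields; opcodes are plain
-- strings here, which is an assumption of this sketch.
data EvacSolution = EvacSolution
  { esMoved   :: [(Int, Int, [Int])]
  , esFailed  :: [(Int, String)]
  , esOpCodes :: [[String]]
  } deriving (Show, Eq)

-- | Hypothetical helper: record one successful move by prepending,
-- which is cheap for lists but stores results newest-first.
addMoved :: (Int, Int, [Int]) -> [String] -> EvacSolution -> EvacSolution
addMoved m job es =
  es { esMoved = m : esMoved es, esOpCodes = job : esOpCodes es }

-- | Restore chronological order once accumulation is finished.
reverseEvac :: EvacSolution -> EvacSolution
reverseEvac (EvacSolution m f o) =
  EvacSolution (reverse m) (reverse f) (reverse o)
```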
|
|
|
:: List | Group list
| -> List | The node map
| -> Int | The number of nodes required
| -> Bool | Whether to drop or not
unallocable nodes
| -> Result AllocNodes | The (monadic) result
| Generate the valid node allocation singles or pairs for a new instance.
|
|
|
|
:: Monad m | | => List | The node list
| -> List | The instance list
| -> Instance | The instance to allocate
| -> AllocNodes | The allocation targets
| -> m AllocSolution | Possible solution list
| Try to allocate an instance on the cluster.
|
|
|
|
Given a group/result, describe it as a nice (list of) messages.
|
|
|
From a list of possibly bad and possibly empty solutions, filter
only the groups with a valid result. Note that the result will be
reversed compared to the original list.
|
|
|
Sort multigroup results based on policy and score.
|
|
|
:: List | The group list
| -> List | The node list
| -> List | The instance list
| -> Maybe [Gdx] | The allowed groups
| -> Instance | The instance to allocate
| -> Int | Required number of nodes
| -> Result (Gdx, AllocSolution, [String]) | | Finds the best group for an instance on a multi-group cluster.
Only solutions in preferred and last_resort groups will be
accepted as valid, and additionally if the allowed groups parameter
is not null then allocation will only be run for those group
indices.
|
|
|
|
:: List | The group list
| -> List | The node list
| -> List | The instance list
| -> Instance | The instance to allocate
| -> Int | Required number of nodes
| -> Result AllocSolution | Possible solution list
| Try to allocate an instance on a multi-group cluster.
|
|
|
|
Function which fails if the requested mode is change secondary.
This is useful since except DRBD, no other disk template can
execute change secondary; thus, we can just call this function
instead of always checking for secondary mode. After the call to
this function, whatever mode we have is just a primary change.
|
|
|
:: List | The node list (cluster-wide)
| -> List | Instance list (cluster-wide)
| -> EvacMode | The evacuation mode
| -> Instance | The instance to be evacuated
| -> Gdx | The group we're targeting
| -> [Ndx] | The list of available nodes
for allocation
| -> Result (List, List, [OpCode]) | | Run evacuation for a single instance.
Note: this function should correctly execute both intra-group
evacuations (in all modes) and inter-group evacuations (in the
ChangeAll mode). Of course, this requires that the correct list
of target nodes is passed.
|
|
|
|
:: List | The node list (cluster-wide)
| -> List | Instance list (cluster-wide)
| -> Instance | The instance to be evacuated
| -> Gdx | The group we're targeting
| -> [Ndx] | The list of available nodes
for allocation
| -> Result (List, List, [OpCode]) | | Generic function for changing one node of an instance.
This is similar to nodeEvacInstance but will be used in a few of
its sub-patterns. It folds the inner function evacOneNodeInner
over the list of available nodes, which results in the best choice
for relocation.
|
|
|
|
:: List | Cluster node list
| -> Instance | Instance being evacuated
| -> Gdx | The group index of the instance
| -> Ndx -> IMove | Operation constructor
| -> EvacInnerState | Current best solution
| -> Ndx | Node we're evaluating as target
| -> EvacInnerState | New best solution
| Inner fold function for changing one node of an instance.
Depending on the instance disk template, this will either change
the secondary (for DRBD) or the primary node (for shared
storage). However, the operation is generic otherwise.
The running solution is either a Left String, which means we
don't yet have a working solution, or a Right (...), which
represents a valid solution; it holds the modified node list, the
modified instance (after evacuation), the score of that solution,
and the new secondary node index.
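The Left/Right running state lends itself to a plain left fold. In the sketch below the state is reduced to a score and a node index, and tryNode is a hypothetical stand-in for applying the move and scoring the resulting cluster; lower scores are assumed to be better.

```haskell
import Data.List (foldl')

type Ndx = Int
type Score = Double

-- Running state: Left = no working solution yet (with a reason),
-- Right = best (score, target node) found so far.
type InnerState = Either String (Score, Ndx)

-- | One fold step over a candidate target node.
step :: (Ndx -> Either String Score) -> InnerState -> Ndx -> InnerState
step tryNode best ndx =
  case (tryNode ndx, best) of
    (Left err, Left _)  -> Left err   -- still nothing; keep the latest reason
    (Left _, Right _)   -> best       -- a failure never displaces a solution
    (Right sc, Right (bsc, _)) | bsc <= sc -> best
    (Right sc, _)       -> Right (sc, ndx)

-- | Fold the step over all available nodes.
pickTarget :: (Ndx -> Either String Score) -> [Ndx] -> InnerState
pickTarget tryNode = foldl' (step tryNode) (Left "no nodes available")
```

Starting from a Left means an empty candidate list naturally produces an error message rather than a partial result.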
|
|
|
|
:: List | Cluster node list
| -> List | Cluster instance list
| -> Instance | The instance to be moved
| -> Gdx | The target group index
(which can differ from the
current group of the
instance)
| -> (Ndx, Ndx) | Tuple of new
primary/secondary nodes
| -> Result (List, List, [OpCode], Score) | | Compute result of changing all nodes of a DRBD instance.
Given the target primary and secondary node (which might be in a
different group or not), this function will execute all the
required steps and, assuming all operations succeed, will return
the modified node and instance lists, the opcodes needed for this
and the new group score.
|
|
|
|
:: [(Gdx, [Ndx])] | Group index/node index assoc list
| -> IntSet | Nodes that are excluded
| -> Gdx | The group for which we
query the nodes
| -> Result [Ndx] | List of available node indices
| Computes the nodes in a given group which are available for
allocation.
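A minimal sketch of this lookup-and-filter step, using Either String as a stand-in for the Result type; the function name availableNodes and the error message text are assumptions of the example.

```haskell
import qualified Data.IntSet as IntSet

type Gdx = Int
type Ndx = Int

-- | Look up the group's nodes and drop the excluded ones.
-- Either String stands in for the Result type (an assumption).
availableNodes :: [(Gdx, [Ndx])] -> IntSet.IntSet -> Gdx -> Either String [Ndx]
availableNodes groupNodes excluded gdx =
  case lookup gdx groupNodes of
    Nothing -> Left ("Can't find group with index " ++ show gdx)
    Just ns -> Right (filter (`IntSet.notMember` excluded) ns)
```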
|
|
|
|
Updates the evac solution with the results of an instance
evacuation.
|
|
|
:: List | The cluster groups
| -> List | The node list (cluster-wide, not per group)
| -> List | Instance list (cluster-wide)
| -> EvacMode | The evacuation mode
| -> [Idx] | List of instance (indices) to be evacuated
| -> Result (List, List, EvacSolution) | | Node-evacuation IAllocator mode main function.
|
|
|
|
:: List | The cluster groups
| -> List | The node list (cluster-wide)
| -> List | Instance list (cluster-wide)
| -> [Gdx] | Target groups; if empty, any
groups not being evacuated
| -> [Idx] | List of instance (indices) to be evacuated
| -> Result (List, List, EvacSolution) | | Change-group IAllocator mode main function.
This is very similar to tryNodeEvac, the only difference is that
we don't choose as target group the current instance group, but
instead:
1. at the start of the function, we compute which are the target
groups; either no groups were passed in, in which case we choose
all groups out of which we don't evacuate instances, or there were
some groups passed, in which case we use those
2. for each instance, we use findBestAllocGroup to choose the
best group to hold the instance, and then we do what
tryNodeEvac does, except for this group instead of the current
instance group.
Note that the correct behaviour of this function relies on
nodeEvacInstance correctly handling both intra-group and
inter-group moves when passed the ChangeAll mode.
|
|
|
|
Standard-sized allocation method.
This places instances of the same size on the cluster until we're
out of space. The result will be a list of identically-sized
instances.
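The idea can be sketched on a toy model in which each node is just an amount of free memory and every instance needs the same amount. The placement rule (pick the node with the most free memory) and the names placeOne and iterateSketch are assumptions of this example, not the module's algorithm.

```haskell
-- | Place one instance of size 'need' on the node with the most free
-- memory; Nothing when no node can hold it.
placeOne :: Int -> [Int] -> Maybe [Int]
placeOne need free
  | null free || best < need = Nothing
  | otherwise = Just (before ++ (best - need) : drop 1 after)
  where
    best = maximum free
    (before, after) = break (== best) free

-- | Keep allocating identical instances until placement fails.
-- Returns the number of instances placed and the final free memory.
-- Assumes need > 0, otherwise the loop would not terminate.
iterateSketch :: Int -> [Int] -> (Int, [Int])
iterateSketch need = go 0
  where
    go n free = case placeOne need free of
      Nothing    -> (n, free)
      Just free' -> go (n + 1) free'
```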
|
|
|
Tiered allocation method.
This places instances on the cluster, and decreases the spec until
we can allocate again. The result will be a list of decreasing
instance specs.
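A toy version of the tiered idea, assuming a single pool of free memory and a spec that is halved whenever it no longer fits. Halving and the name tieredSketch are assumptions made for illustration; this page does not say how the spec is decreased.

```haskell
-- | Allocate from a single pool, halving the spec when it no longer
-- fits, and stop once the spec drops below the minimum size.
-- Assumes minSize >= 1 so that the recursion terminates.
tieredSketch :: Int    -- ^ Minimum acceptable spec
             -> Int    -- ^ Current spec
             -> Int    -- ^ Free resources
             -> [Int]  -- ^ Allocated specs, in allocation order
tieredSketch minSize size free
  | size < minSize = []
  | size <= free   = size : tieredSketch minSize size (free - size)
  | otherwise      = tieredSketch minSize (size `div` 2) free
```

The resulting list is non-increasing, matching the "list of decreasing instance specs" described above.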
|
|
Formatting functions
|
|
|
:: Instance | The instance to be moved
| -> String | The instance name
| -> IMove | The move being performed
| -> String | New primary
| -> String | New secondary
| -> (String, [String]) | Tuple of moves and commands list; moves contains
either f for failover or r:name for replace
secondary, while the command list holds gnt-instance
commands (without that prefix), e.g. "failover instance1"
| Given the original and final nodes, computes the relocation description.
|
|
|
|
:: List | The node list
| -> List | The instance list
| -> Int | Maximum node name length
| -> Int | Maximum instance name length
| -> Placement | The current placement
| -> Int | The index of the placement in
the solution
| -> (String, [String]) | | Converts a placement to string format.
|
|
|
|
:: List | Instance list, used for retrieving
the instance from its index; note
that this must be the original
instance list, so that we can
retrieve the old nodes
| -> Placement | The placement we're investigating,
containing the new nodes and
instance index
| -> [Ndx] | Resulting list of node indices
| Return the instance and involved nodes in an instance move.
Note that the output list length can vary, and is not required nor
guaranteed to be of any specific length.
|
|
|
|
Inner function for splitJobs, that either appends the next job to
the current jobset, or starts a new jobset.
|
|
|
Break a list of moves into independent groups. Note that this
will reverse the order of jobs.
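The append-or-start-new rule can be sketched as a fold. Here a MoveJob is reduced to the nodes it touches plus a label, and "independent" is taken to mean "shares no node with the jobs already in the current jobset"; both are assumptions of this example.

```haskell
import Data.List (foldl', intersect)

type Ndx = Int

-- Hypothetical job: the nodes it touches plus a label.
type MoveJob = ([Ndx], String)
type JobSet = [MoveJob]

-- | Append the job to the current (head) jobset when it shares no node
-- with it; otherwise open a new jobset. The second tuple element
-- tracks the nodes used by the current jobset.
mergeJobs :: ([JobSet], [Ndx]) -> MoveJob -> ([JobSet], [Ndx])
mergeJobs ([], _) job@(nodes, _) = ([[job]], nodes)
mergeJobs (cur : rest, busy) job@(nodes, _)
  | null (nodes `intersect` busy) = ((job : cur) : rest, nodes ++ busy)
  | otherwise                     = ([job] : cur : rest, nodes)

-- | Break a list of jobs into jobsets; new items are prepended, so
-- both the jobsets and the jobs inside them come out newest-first.
splitJobs :: [MoveJob] -> [JobSet]
splitJobs = fst . foldl' mergeJobs ([], [])
```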
|
|
|
Given a list of commands, prefix them with gnt-instance and
also beautify the display a little.
|
|
|
Given a list of commands, prefix them with gnt-instance and
also beautify the display a little.
|
|
|
Print the node list.
|
|
|
Print the instance list.
|
|
|
Shows statistics for a given node list.
|
|
|
:: List | The node list; only used for node
names, so any version is good
(before or after the operation)
| -> List | The instance list; also used for
names only
| -> Idx | The index of the instance being
moved
| -> IMove | The actual move to be described
| -> [OpCode] | The list of opcodes equivalent to
the given move
| Convert a placement into a list of OpCodes (basically a job).
|
|
|
Node group functions
|
|
|
Computes the group of an instance.
|
|
|
Computes the group of an instance per the primary node.
|
|
|
Compute the list of badly allocated instances (split across node
groups).
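A minimal sketch of the split check, assuming a hypothetical cut-down instance record (name, primary, optional secondary) and a node-to-group association list in place of the real node and instance containers.

```haskell
import Data.List (nub)
import Data.Maybe (mapMaybe, maybeToList)

type Ndx = Int
type Gdx = Int

-- Hypothetical cut-down instance: a name, a primary node and an
-- optional secondary node.
data Inst = Inst { iName :: String, iPri :: Ndx, iSec :: Maybe Ndx }
  deriving (Show, Eq)

-- | Distinct groups touched by an instance, given a node-to-group table.
instGroups :: [(Ndx, Gdx)] -> Inst -> [Gdx]
instGroups nodeGroup inst =
  nub (mapMaybe (`lookup` nodeGroup) (iPri inst : maybeToList (iSec inst)))

-- | Instances whose nodes live in more than one group.
findSplit :: [(Ndx, Gdx)] -> [Inst] -> [Inst]
findSplit nodeGroup = filter ((> 1) . length . instGroups nodeGroup)
```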
|
|
|
Splits a cluster into the component node groups.
|
|
|
:: List | The cluster-wide instance list
| -> EvacMode | The evacuation mode we're using
| -> [Idx] | List of instance indices being evacuated
| -> IntSet | Set of node indices
| Compute the list of nodes that are to be evacuated, given a list
of instances and an evacuation mode.
|
|
|