Factories currently place considerable emphasis on specialized flexible manufacturing cells. Flexible systems adapt well to large fluctuations in process demands, so there has been an increasing trend towards using a robot as a material/workpiece handling device (MHD). The industrial robot forms the base of a workstation around which all other resources are placed, depending on the specifics of the application. Furthermore, a robot workstation layout is defined by the product or by the planned sequence of operations. With the robot on one side and a basic type of organization such as a product layout on the other, the most practical layout configuration is a circular layout. Regarding robot representation, there are four major groups of robot configurations: delta robots, SCARA robots, Cartesian robots and articulated robots. The first three groups are mostly used in pick-and-place, packaging, assembly and CNC machine tooling applications. In these applications their speed and precision outweigh their payload and working envelope limitations. We consider the six-axis jointed-arm industrial robot (also known as an articulated robot), as it is the most commonly used robot configuration. It can be used in many different continuous-path and point-to-point controlled operations such as spraying, welding, cutting, deburring, polishing, quality checking, assembly, sandblasting, etc. The robot is fixed in the centre of the workstation. As an MHD, the robot transports the workpiece between the resources surrounding it, and the resources are chosen according to the needs of the operation sequence. The first and the last resource in the operation sequence are equipped with an independent system (manual or autonomous) that feeds workpieces into and out of the workstation. As a first discussion of the influence of failure recovery on the layout design, a static layout analysis is applied.
The goal of every workstation designer is to assign resources to the available work area so that certain objectives are satisfied. An optimal workstation layout reduces the manufacturing lead time and increases productivity. A key question of the machine layout problem (MLP) is how to obtain an effective resource arrangement. Over the last decades a very large number of publications have focused on this area, especially on the facility layout problem (FLP). This is understandable, because previous research has shown that an effective FLP/MLP solution reduces the overall system cost. Another important issue, which has kept researchers occupied for more than a decade, is that the MLP is generally complex and NP-hard. Depending on the volume-variety characteristics of the operation, there are four types of layout organization: a fixed-position layout, a functional layout, a cell layout and a product layout. The layout design can be analyzed through some basic configurations such as a single-row, a multi-floor or a loop layout. According to the layout evolution, we distinguish between a static layout and a dynamic layout. Most researchers treat FLP/MLP as an optimization problem, and therefore the layout formulation has to be defined. Discrete formulations are very often addressed as instances of the quadratic assignment problem (QAP), whereas continuous formulations are presented as mixed integer programming (MIP) problems. Work on resolution approaches to FLP/MLP has produced numerous articles focused on heuristic and metaheuristic methods. The currently active field of research can be divided into three groups by the type of heuristic approach: global search methods (simulated annealing (SA) and tabu search), hybrid approaches and evolutionary approaches. Some researchers discuss MLP in a robot cell using the joint coordinates of the robot.
Under the constant joint velocity hypothesis, they use an algorithm based on sequential quadratic programming (SQP) to show that minimizing the total travel displacement gives better computational performance than minimizing the total travel time. Guided by the same principles, a model of robot joint displacement as a function of time has also been presented. The resulting solution was a layout with feasible robot configurations optimized for the total cycle time. A fixed-base robot, a single operation sequence and a rectangular approximation of the resource shape are some of the assumptions postulated later in this field. The authors claim that applying their heuristic algorithm to the given layout can reduce the total distance travelled by the robot arm by 45%. Some authors also state that the type of MHD determines the pattern to be used for MLP. They further show that the QAP cannot be used to formulate MLP because machine shapes and sizes are generally not equal. The QAP objective function assumes that the distance between resources is constant, which makes it unsuitable for MLP. There are papers dedicated to robot workstation layout design but, to the best of our knowledge, none considers the contribution of failure recovery to the layout design while keeping cost reduction as the main objective. According to some research, worldwide studies indicate that the average overall equipment effectiveness (OEE) rate in manufacturing facilities is 60%. A world-class OEE is considered to be 85% or better. Clearly, there is room for improvement. Moreover, current research results show that equipment breakdowns or failures are one of the two major downtime factors. Scheduled maintenance is the second major factor that reduces OEE, but as a part of planned maintenance it is already considered in the equipment availability calculation (i.e.
operating time and planned production time). These are the obvious reasons why it is important to consider resource failures in MLP in order to reach a better OEE. Previous works that try to solve MLP consider some of the important robot workstation aspects only partially (e.g. resource shape, size, overlapping, collision-free paths, robot kinematics, layout type, space representation). In contrast, this paper provides a complete failure recovery strategy that takes all of the mentioned robot workstation aspects fully into account. To avoid computationally expensive resource approximations, we assume a circular shape approximation. With this approximation, the circle radius is the only input parameter needed to define the resource size together with the rest of the approximated area (not covered by the real resource shape), which is needed for the peripheral equipment (e.g. each CNC machine has at least one oil tank, chip container, tool rack, etc.). A reasonable assumption is to equate the centre of the approximation with the pick/drop point of the resource. The position of the pick/drop point is generally unique for the majority of resources (e.g. actuators, CNC machines, dimension control systems, etc.). The height of the resources is not considered, since resources should be arranged with enough maintenance space between them (the base stand should be designed so that the MHD can reach the height of the pick/drop points of standard resources in the working area). Additionally, the rate of resource failure and the treatment quality are introduced as new system recovery parameters. The design of a robot workstation layout is the first step of the failure recovery strategy. Here we present a new formulation of the optimization problem that includes the failure recovery parameters in the objective function.
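The circular shape approximation described above can be illustrated with a minimal sketch. It assumes a robot fixed at the origin whose reachable area is an annulus between two radii; the names `Resource`, `r_min`, `r_max` and `clearance` are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Resource:
    name: str
    x: float       # pick/drop point, taken as the circle centre
    y: float
    radius: float  # covers the machine footprint plus peripheral equipment


def overlap(a, b, clearance=0.0):
    """True if the two circles are closer than their radii plus a clearance."""
    return math.hypot(a.x - b.x, a.y - b.y) < a.radius + b.radius + clearance


def reachable(res, r_min, r_max):
    """True if the pick/drop point lies inside the assumed robot reach annulus."""
    return r_min <= math.hypot(res.x, res.y) <= r_max


def layout_is_valid(resources, r_min, r_max, clearance=0.0):
    """Check reachability of every resource and pairwise non-overlap."""
    if not all(reachable(r, r_min, r_max) for r in resources):
        return False
    return not any(overlap(a, b, clearance) for a, b in combinations(resources, 2))
```

Because each resource is reduced to a centre and a radius, the overlap test is a single distance comparison per pair, which is what keeps the approximation computationally cheap.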
A conventional layout organization, based on minimizing the total material handling cost, is extended with a new term that minimizes the distance between a resource in a failure condition and the resource with which it has an operational coincidence (its recovery option). A new layout organization is thus defined based on the resource failure rate and the treatment quality of the recovery option. The presented objective function leads to a nonlinear optimization problem (NLP). To solve the optimization problem, a state-of-the-art algorithm is used. Interior-point methods are very effective for NLP problems in which the objective function and all constraints are convex. In that case solvers can easily reach the global optimum, even for very large problem sizes. A problem arises if some of the constraints or the objective function are non-convex. The presence of multiple locally optimal solutions can increase the computation time exponentially. This is the reason why we chose the most widely used method for solving NLP problems, i.e. SQP (an active-set method). The active set of constraints determines which constraints influence the final solution. Necessary conditions for optimality are given by the Karush–Kuhn–Tucker (KKT) equations (essentially a generalization of the Lagrange multiplier method, which itself allows only equality constraints). The KKT multipliers tell the algorithm where the gradient of the objective function is aligned with the gradients of its constraints, which should hold at the constrained minimum, bearing in mind that we move in the direction of the negative gradient when minimizing. To transform the problem into an easier subproblem, the SQP method uses a quadratic approximation of the objective and a linearization of the constraints. Instead of computing the Hessian matrix exactly (as in Newton’s method), quasi-Newton methods update an approximation of the Hessian of the Lagrangian function at each major iteration. The solution of the quadratic subproblem gives the descent direction for the line search.
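The structure of such an extended objective can be sketched as follows. This is only an illustration under stated assumptions, not the cost function of the paper: the handling cost is approximated by the distance between consecutive pick/drop points, and the recovery term weights the distance between a failing resource and its recovery option by the product of failure rate and treatment quality. The function name `layout_cost` and both weight parameters are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y) pick/drop points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def layout_cost(positions, sequence, recovery,
                handling_weight=1.0, recovery_weight=1.0):
    """Illustrative layout objective (assumed form).

    positions: dict mapping resource name -> (x, y)
    sequence:  resource names in planned operation order
    recovery:  list of (failing, backup, failure_rate, quality) tuples
    """
    # Conventional term: travel between consecutive resources in the sequence.
    handling = sum(distance(positions[a], positions[b])
                   for a, b in zip(sequence, sequence[1:]))
    # Assumed recovery term: pull frequently failing resources towards
    # backups that offer a high treatment quality.
    recovering = sum(rate * quality * distance(positions[f], positions[b])
                     for f, b, rate, quality in recovery)
    return handling_weight * handling + recovery_weight * recovering
```

In a full formulation this cost would be minimized subject to non-overlap and reachability constraints; only the objective is shown here.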
The most popular method for updating the Hessian approximation is the BFGS update formula. Solving the NLP by SQP is highly dependent on the selection of the initial conditions. To deliver a feasible solution, the SQP algorithm must start from good initial positions of the resources. Starting from a poor initial solution, and given the imposed constraints, the algorithm will not be able to improve the solution. To overcome this problem, a set of random initial solutions, normally distributed across the complete working area, is generated. The reasons for choosing this option are the high computational efficiency of the SQP algorithm and the low chance of ending in a local minimum (by starting from many different conditions we give the algorithm a dimension of global search). The algorithm criterion value, exit flags or step size (line search in the active-set method) can be used to analyse solution feasibility after the constraints and bounds have been validated. The next step of the failure recovery strategy is related to the control design of such a system. In their earlier research, the authors presented a generic algorithm suitable for implementing workstation control logic, while here we present a modified failure recovery algorithm with a more practical recovery resolution technique. An exact and clear approach is given by means of a weighted cycle time comparison and offline simulation data, so that an operational recovery decision can be resolved in the safest and most cost-effective way. For a workstation with a circular layout configuration, with an industrial robot fixed in the centre and other resources around it, one can use an augmented Petri net approach to bring the system back from a failure to a normal state. There are four failure types in a robot workstation: malfunction, irregularity, collision and sensory induced error. Generally, it is impossible to recover from a robot malfunction (i.e.
human intervention is required) or malfunction of any other important part of the workstation equipment, or from a terminal collision (i.e. a collision situation where human intervention is required). Still, system recovery is possible for the other failure types. A better understanding of possible failures leads to a more complete resolution strategy. Not all possible failure states of a system can be resolved by passive prevention (which is guaranteed by the Petri net model of the system). The concept of augmented Petri nets guarantees that the system follows predetermined recovery steps in order to reach a normal state. Three possible recovery scenarios are proposed in current research: input conditioning, forward failure recovery and backward failure recovery. In terms of the Petri net formalism, this means an augmentation of the basic system model that is deleted after a successful recovery (i.e. after a normal state of the system is reached). It is possible to wait for some conditions to be fulfilled, or to recover to a later or a previous system state that can be reached from the failure state over the recovery subnet. To prepare all possible recovery steps, offline simulation data should be used. Three-dimensional motion packages with CAD simulation studios can be very helpful for preparing operational recovery options and all other safe recovery procedures. Some algorithms associated with automated failure recovery have been presented in the literature. For example, an algorithm for constructing a failure propagation tree is presented as a part of the diagnostic step in a failure recovery process. A hierarchical scheme for the operational control of an FMS is also presented in current research, forming a production policy and scheduling algorithm that can anticipate workstation failures.
Compared to these algorithms, the algorithm presented in this paper is a low-level workstation control algorithm that supervises the operations of the workstation. The true benefits of the presented algorithm are that it can be implemented in a workstation controller and that it provides exact reasoning in the case of failure recovery (red boxes in the flowchart). The failure recovery control strategy is based on the assumption that operational parameters, such as treatment quality and operation cycle time, need not be considered in the case of collision, malfunction and sensory caused irregularity if these failures are not related to some operation oi. For example, in the case of a controller malfunction, we do not need to consider operational parameters to decide on the next step. If a failure situation is foreseeable and we are able to determine which conditions must be satisfied (offline preparation is necessary), it is clear which system state should be next (e.g. delay, operator action, maintenance). For an operational failure, which is any failure related to the treatment of the piece (e.g. a resource malfunction or a detected tool break), the decision becomes critical when different recovery options are possible. In this case, as in the layout design, we have to consider the workstation productivity and the treatment quality of the piece. For every operational coincidence (i.e. similarity between operations) between resources, there are specific recovery options for the planned operations. A system is normally designed to treat the workpiece at a dedicated resource with the highest treatment quality and the planned cycle time. If it is possible to treat the workpiece on a different resource that is able to perform the problematic operation, and the treatment quality there is higher than some minimal acceptable value (defined by system quality experts), this resource is a possible recovery option for operation oi (and also an available system augmentation).
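The screening of recovery options described above can be sketched as a simple filter. The data layout is an assumption made for illustration: `capabilities` is a hypothetical table of which resource can perform which operation, with a quality score relative to the dedicated resource and a cycle time in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    resource: str
    quality: float     # assumed scale: 1.0 = quality of the dedicated resource
    cycle_time: float  # seconds


def feasible_options(operation, capabilities, failed_resource, min_quality):
    """Return resources that can take over `operation` with acceptable quality.

    capabilities: dict resource -> dict operation -> (quality, cycle_time)
    """
    options = []
    for resource, operations in capabilities.items():
        # Skip the failed resource and resources without operational coincidence.
        if resource == failed_resource or operation not in operations:
            continue
        quality, cycle_time = operations[operation]
        # Keep only options above the minimal acceptable quality.
        if quality >= min_quality:
            options.append(RecoveryOption(resource, quality, cycle_time))
    return options
```

An empty result means that no operational recovery is available for the operation, so another resolution (e.g. delay, operator action, maintenance) has to be selected.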
Besides the lower treatment quality, the cycle time of a recovery option will be longer due to technical and physical limitations of the recovery resource. The cycle time of an operation is therefore chosen to carry the decision about the recovery augmentation. To reduce the coefficients to the same unit, we use a score system. The purpose of the quality score coefficient is to convert a loss of quality into a loss of time (i.e. the weighted method makes it possible to penalize a treatment whose quality is not at 100%). This means that whenever the algorithm has to decide on the next step, it does so based on a variable expressed in seconds, and a weighted cycle time comparison is performed for all recovery options. The arbitrary weight factor can be chosen so as to stress the time lost or gained. If the dedicated resource is damaged and currently stopped, the system is rerouted to perform the problematic operation on the recovery resource. For the MHD (robot) this can mean a longer transport time. Various differences between resources and operations can add even more time to the transport cycle in the case of a recovery augmentation. Additionally, the robot may have to spend extra time changing the workpiece orientation so that the system can overcome differences in workpiece or resource geometry (i.e. after such a change, the recovery resource chosen by the control algorithm will be able to accept the workpiece in a different operational stage). In a flexible manufacturing cell served by a robot, it is essential that any layout or control logic planning takes the failure recovery aspects into account. Given enough time, the system will sooner or later end up in some kind of failure condition. Therefore, the system development must, among other things, rely on a good recovery strategy. One of the most important steps in planning a robot workstation is the allocation of resources to positions.
With a proper layout, the system flexibility can be used efficiently to ensure successful failure recovery. By introducing novel recovery parameters, namely a resource failure rate and a treatment quality, we formulated and presented an optimization criterion that consists of the handling costs and the failure recovery costs. The real contribution of the presented cost function is the overall consideration of the failure recovery cost with respect to resource shape, size, overlapping and other important robot workstation aspects. The goal of the optimization is to place a resource with a high failure rate as close as possible to its recovery option resource, taking into account the restrictions related to the robot workspace and the dimensions of the resources. We have shown that minimizing the proposed optimization function leads to a system layout that reduces the overall distance between resources, which decreases the robot transport time and ultimately increases OEE. Recovery resolution is the second important aspect of the failure recovery strategy. For an operational failure, the proposed control recovery method ensures that a favorable failure resolution option is chosen. Compared to existing failure recovery concepts, the presented approach is an exact and straightforward resolution method based on the weighted cycle time comparison of possible recovery scenarios. A system augmentation guarantees that the system follows specific recovery steps in order to reach an operational state. The failure recovery algorithm presented in this paper is implemented in the main workstation controller of a robotic workcell at HS Produkt.
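The weighted cycle time comparison can be illustrated with a short sketch. The penalty formula is an assumption chosen for illustration, since the exact score system is not reproduced here: the loss of quality (1 minus the quality score) is converted into seconds by scaling it with the cycle time and an arbitrary weight factor. The names `weighted_cycle_time`, `choose_recovery`, `extra_transport` and `weight` are hypothetical.

```python
def weighted_cycle_time(cycle_time, quality, extra_transport=0.0, weight=1.0):
    """Cycle time in seconds, penalized for quality below 1.0 (assumed formula)."""
    quality_penalty = weight * (1.0 - quality) * cycle_time
    return cycle_time + extra_transport + quality_penalty


def choose_recovery(options, weight=1.0):
    """Pick the recovery option with the smallest weighted cycle time.

    options: list of (name, cycle_time, quality, extra_transport) tuples
    Returns (name, weighted_time), or None if there are no options.
    """
    best = None
    for name, cycle_time, quality, extra_transport in options:
        t = weighted_cycle_time(cycle_time, quality, extra_transport, weight)
        if best is None or t < best[1]:
            best = (name, t)
    return best
```

With a larger weight factor the comparison favours options that preserve quality, while a weight of zero reduces it to a plain comparison of cycle time plus additional transport time.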