In the modern competitive business world, many companies need to modify their business process designs to become more competitive in the enterprise market. They focus their main attention on the optimization and continuous improvement of their processes.
Business Process Management (BPM) and process mining allow companies to analyze and identify possible process improvements through the use of a set of techniques and tools. This research focuses on the last stage of the two disciplines related to the optimization and continuous improvement of business processes, since they offer new alternatives for the optimal performance and control of the processes that are presented daily in companies.
Optimization is the process of finding the best possible solution to a given problem. It can be seen as the search for the values of decision variables for which a certain objective function achieves its extreme value (Chong and Zak, 2004CHONG, K.; ZAK, H.S.: An introduction to optimization, Ed. John Wiley & Sons, New York, USA, 2004, ISBN: 0-471-65400-0., 2013CHONG, E.; ZAK, S.: “An introduction to optimization”, En: An introduction to optimization 4th ed., Ed. John Wiley & Sons, second ed., New York, USA, p. 15, 2013.). Multiobjective optimization applied to business processes can be a good option to improve them since more than one optimization criterion can be selected and satisfied simultaneously.
The Industrial Fishing Company "Camilo Cienfuegos" located in Batabanó Municipality, Mayabeque Province has as objective the capture, industrialization and commercialization of fresh or frozen species of the platform, being the lobster its main exportable line. The constant increase of information to be analyzed in this company brings with it the need to employ analytical techniques for the extraction of knowledge in large volumes of data, necessary to diagnose problems and identify possible areas of improvement in the process. Its use would allow the implementation of a strategy to optimize the production processes of lobster as the key to reduce their production costs and expand in their environment.
This supports the interest of this research in the need that currently exists in the company to have computer tools that allow them to optimize their production processes of lobster, taking into account several objectives such as production cost and completeness of the process.
Business Process Management enables companies to manage their business processes more efficiently by using methods, techniques and tools created to support the design, improvement, management and analysis of these processes (Van der Aalst, 2011VAN DER AALST, W.: “Using process mining to bridge the gap between BI and BPM”, Computer, (12): 77-80, 2011, ISSN: 0018-9162., 2013VAN DER AALST, W.: Using Process Mining to Bridge the Gap between BI and BPM, Inst. IEEE Computer Society, 2013.; Van Der Aalst et al., 2011VAN DER AALST, W.: “Using process mining to bridge the gap between BI and BPM”, Computer, (12): 77-80, 2011, ISSN: 0018-9162.). Process mining is a discipline that aims to discover, monitor and improve processes through the extraction of knowledge from the event records available in current information systems (Van der Aalst, 2011VAN DER AALST, W.: “Using process mining to bridge the gap between BI and BPM”, Computer, (12): 77-80, 2011, ISSN: 0018-9162.; Van Der Aalst et al., 2011VAN DER AALST, W.; ADRIANSYAH, A.; DE MEDEIROS, A.A.K.; ARCIERI, F.; BAIER, T.; BLICKLE, T.; BOSE, J.C.; AGUIAR, P.K.; ABRAHÃO, R.F.; BUIJS, J.: “Process mining manifesto”, En: International Conference on Business Process Management, Ed. Springer, Germany, pp. 169-194, 2011.; Van der Aalst et al., 2011VAN DER AALST, W.; ADRIANSYAH, A.; MEDEIROS, A.F.: Process Mining Manifiesto, Ed. Springer-Verlag, Business Process Management Workshops ed., vol. 99, Springer-Verlag, Germany, 5-20 p., 2011.)..
To support these disciplines, different standards have been created that include symbolic notations for the definition of business processes, such as the Business Process Management Notation (BPMN) by Ter Hofstede and Weske (2003)TER HOFSTEDE, A.H.M.; WESKE, M.: “Business process management: A survey”, [en línea], En: Proceedings of the 1st International Conference on Business Process Management , volume 2678 of LNCS, Ed. Citeseer, 2003, Disponible en:http://www.omg.org/spec/BPMN/2.0/PDF/ , [Consulta: 24 de abril de 2018]. and the Petri dish networks Murata (1989)MURATA, T.: “Petri nets: Properties, analysis and applications”, Proceedings of the IEEE, 77(4): 541-580, 1989, ISSN: 0018-9219.. In this research, BPMN is used to model the production process of the pre-cooked whole lobster using the Bonita Studio tool and then, the BPMN process is converted to the mathematical model Petri network, to which the multiobjective optimization will be applied to improve the process taking into account two objectives at the same time.
Multitarget optimization (MOP) problems are separated from conventional single-target optimization, as the former usually does not deliver a single solution. Instead, MOP generates a set of possible solutions from which decision makers must select which to adopt, based on an evaluation of the performance of the solution across all objectives (Miettinen, 2008MIETTINEN, K.: “Introduction to multiobjective optimization: Noninteractive approaches”, En: Multiobjective optimization, Ed. Springer, Springer Berlin Heidelberg ed., vol. 52, Berlin, Germany, pp. 1-26, 2008.).
In a multiobjective problem, it is possible to choose as many objectives as the business analyst (user) wishes, but in this work, it is necessary to limit the problem to be studied to the optimization of objectives: production cost and completeness of the process to be optimized. The problem is formulated as follows:
Where:
is the function objective of production cost of the process;
is the function objective of completeness of the process.
The function objective of production cost of the process is the sum of the cost of each transition, i.e., the activity in the process; triggered during the parsing of the traces. Only if a transition triggered by this way does not have the necessary tokens for its triggering then its cost will not be added to the total cost. This ensures that the poorer a solution (as calculated by the completeness) for event registering, the lower its cost may be.
The objective completeness function of the process is represented by its model, so a model will be more complete to the extent that it is capable of processing a larger number of traces in the event log. One of the basic ways to obtain completeness would be to divide the number of traces executed correctly by the total number of traces in the event log. The completeness formula proposed by De Medeiros et al. (2007)DE MEDEIROS, A.A.K.; WEIJTERS, J.M.M.A.; VAN DER AALST, M.P.W.: “Genetic process mining: an experimental evaluation”, Data Mining and Knowledge Discovery, 14(2): 245-304, 2007, ISSN: 1384-5810. was taken into account for the development of the computer application of this research, which is described below:
Where:
traces,
causal matrix (coding selected to represent each individual or solution),
correctly executed activities,
total of events in the event log (XES of the process),
total of tokens needed to trigger an activity (transition in the Petri network) but that were not in the entry square of that activity,
total of traces in the event log,
number of traces where tokens were lost,
number of tokens left in the network when the execution of each trace was finished,
total of traces in the event log,
number of traces where tokens were left during its execution.
Evolutionary Algorithms refer to search and optimization techniques inspired by the model of evolution proposed by Darwin (1859)DARWIN, C.R.: On the Origin of Species by Means of Natural Selection Or the Preservation of Favoured Races in the Struggle for Life., Ed. H. Milford; Oxford University Press, New York, USA, 1859.. According to Back (1996)BACK, T.: Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms, Ed. Oxford university press, New York, USA, 1996, ISBN: 0-19-535670-5., they are methods of optimization and search for solutions based on the postulates of biological evolution. In them, a group called population is maintained, whose elements represent possible solutions, which are mixed, and compete with each other, in such a way that the most suitable ones are able to prevail over time, evolving towards better solutions. The population, in the context of evolutionary computing in a general way, refers to a set of possible solutions (feasible solutions) of the problem to be solved.
There are several types of evolutionary algorithms, among them, the most prominent are genetic algorithms and these have proven to be general, robust and powerful tools. In this research, genetic algorithms are used for multi-target optimization because they are less susceptible to the shape and continuity of the Pareto frontier, require little information from the domain and are relatively easy to use and implement. The multi-target genetic algorithms considered for this research are the NSGAII proposed by Zitzler et al. (2001)ZITZLER, E.; LAUMANNS, M.; THIELE, L.: “SPEA2: Improving the strength Pareto evolutionary algorithm”, ComputerEngineering and Networks Laboratory (TIK)-report, 103, 2001. and Deb et al. (2002)DEB, K.; PRATAP, A.; AGARWAL, S.; MEYARIVAN, T.: “A fast and elitist multiobjective genetic algorithm: NSGA-II”, IEEE transactions on evolutionary computation, 6(2): 182-197, 2002, ISSN: 1089-778X., since these are the most representative algorithms to solve multi-target optimization problems.
The NSGAII uses a rapid procedure to organize the population by non-dominance, an approach to preserve elitism, and a non-nichemical operator to disperse the individuals on the Pareto border. In this algorithm, the descendant population (size ) is created in first instance using the parent population (size ). After this, the two populations are combined to form of size . After that, by means of a non-dominated sorting, the population is classified in different Pareto fronts. Once the non-dominated sorting process has been completed, the new population is generated from the configurations of the non-dominated fronts. This new population starts to be built with the best non-dominated front , continues with the solutions of the second front , third and so on. Since the population is in size, and there are only configurations that make up the descendant population, not all the configurations of the fronts belonging to the population will be able to be accommodated in the new population. Those fronts that cannot be accommodated disappear (Deb et al., 2002DEB, K.; PRATAP, A.; AGARWAL, S.; MEYARIVAN, T.: “A fast and elitist multiobjective genetic algorithm: NSGA-II”, IEEE transactions on evolutionary computation, 6(2): 182-197, 2002, ISSN: 1089-778X.).
The NSGAII uses a rapid procedure to organize the population by non-dominance, an approach to preserve elitism, and a non-nichemical operator to disperse the individuals on the Pareto border. In this algorithm, the descendant population (size ) is created in first instance using the parent population (size ). After this, the two populations are combined to form of size 2N. After this, by means of a non-dominated arrangement, the population is classified in different Pareto fronts.
The SPEA2 of Zitzler et al. (2001)ZITZLER, E.; LAUMANNS, M.; THIELE, L.: “SPEA2: Improving the strength Pareto evolutionary algorithm”, ComputerEngineering and Networks Laboratory (TIK)-report, 103, 2001., focuses on improving the skill allocation, parent selection, truncation operator and setting the external file size for all generations. In this, the skill allocation function is improved by taking into account for each individual, the number of individuals it dominates and the number of individuals that it is dominated by. This scheme also adds an estimate of population density. The size of the external population (used for elitism) is fixed, unlike the SPEA, in which the size is variable but limited. is made up only of non-dominated individuals as long as the number of these is greater than or equal to . In the case where the number of non-dominated individuals is less than , dominated individuals are included within until the size of is equal to a .
The clustering technique, which is responsible for maintaining the diversity of the population in SPEA, is replaced by a truncation method, which avoids eliminating the extreme solutions from the set of non-dominated solutions. The selection is made by means of a binary tournament, taking as a criterion of comparison the fitness of each one of the individuals. SPEA2 assumes fitness minimization; therefore, the individual with the lowest fitness value wins the tournament (Zitzler et al., 2001ZITZLER, E.; LAUMANNS, M.; THIELE, L.: “SPEA2: Improving the strength Pareto evolutionary algorithm”, ComputerEngineering and Networks Laboratory (TIK)-report, 103, 2001.).
The following technologies and tools were used to implement the computer application to optimize the process of Pre-Cooked Whole Lobster Production:
Java as the programming language and Netbeans as the development environment (Peñarrieta, 2017PEÑARRIETA, R.: Programacion Java y Netbeans, 2017.).
JMetal as a framework proposed by Nebro & Durillo (2014)NEBRO, A.; DURILLO, J.: jMetal 4.5 User Manual, jMetal, 2014., for the implementation of the NSGAII and SPEA2 algorithms.
Bonita Studio modeling tool available in Castillo (2011)CASTILLO, A.P.A.: BONITA SOFT: Gestor de procesos de negocios BPM, [en línea], Inst. Universidad Nacional de Colombia, 2011, Disponible en:http://www.bonitasoft.com , [Consulta: 6 de abril de 2018]., for the modeling of the process through BPMN notation. Yasper for the import of the process in BPMN format and convert it into a Petri network, exporting the process in the PNML file that is used to perform the optimization through the computer application.
The computer application implemented consists of optimizing the production process of the pre-cooked whole lobster taking into account the production cost and completeness of it. To do this, it is necessary to have the process in a Petri Network and from its representation in the PNML file and its event log in its XES file, the process is loaded and the input parameters to be considered for each of the algorithms to be used to carry out the optimization are configured.
For the process under study, the following input parameters were taken into account for both algorithms: the size of the initial population (10), the number of evaluations (1000), the crossover factor (90%), the mutation factor (50%) and in the case of the SPEA2 algorithm a new parameter is required which is the size of the file (10). Figure 1 shows the application after loading the process and setting the input parameters for the NSGAII algorithm and Figure 2 for the SPEA2 algorithm.
Once the input parameters have been configured, the Optimize option is performed and then Figure 3 shows the set of optimal solutions for the optimized process according to the selected algorithm. In this case, the possible solutions are shown using the NSGAII algorithm. From the set of optimal solutions proposed, the one that is desired to be visualized can be selected and the optimized process design is shown according to the selected solution and the legend of the activities eliminated for that design are shown to the user. Then, the business analyst (user) is in charge of choosing the best one to apply to the process performance.
For the evaluation of the algorithms, the same input parameters were used and 5 runs were made for each of them. To establish a preliminary comparison between the two, four metrics were taken into account: the generation of non-dominated vectors, the ratio of non-dominated vector generation, the actual generation of non-dominated vectors and the generation distance used in Duarte (2001)DUARTE, S.F.: optimización multiobjetivo, Asunción, Paraguay, 2001.. Applying these metrics to the results of the algorithms used in this work, the results shown in Tables 1 and 2 were obtained.
Analyzing the data of Tables 1 and 2, it can be observed that, for both algorithms, the value of the non-dominated vector generation metric is 10 for each of the executions; this is due to the size of the initial population in both algorithms and, in the case of SPEA2 algorithm, by the size of the file used. In the non-dominated vector generation ratio metric, both algorithms present the same value (0.4347) because they depend on the previous metric. Both metrics do not show feasible results to compare both algorithms because there are no differences between them. However, with the real generation of non-dominated vectors metric, it is different. It is clear how the SPEA2 algorithm manages to find more solutions in the optimal Pareto front. And this statement is demonstrated by the generation distance between and . For the SPEA2 algorithm executions, the generation distance is lower in average than for the NSGAII algorithm executions.
In this case, it is possible to conclude that the SPEA2 algorithm obtains better solutions than the NSGAII algorithm, although, the SPEA2 algorithm consumes, on average, more time to find the solutions than the NSGAII algorithm. However, it is left to the user's consideration to select the most optimal possible solutions offered since it is the user who knows which alternative or objective could improve the process of producing pre-cooked whole lobster at the company in Batabanó.
The development of this work allowed the implementation of a computer application through which the process of the Production of Pre-Cooked Whole Lobster was optimized, facilitating the business analysts to obtain models of the optimized process and, thus, achieving its better performance in the Industrial Fishing Company "Camilo Cienfuegos" of Batabanó. The NSGAII and SPEA2 algorithms used in the optimization of the process were evaluated taking into account the metrics of generation of non-dominant vectors, the ratio of generation of non-dominant vectors, the real generation of non-dominant vectors and the generation distance. These showed that the SPEA2 algorithm manages to find better solutions on the optimal Pareto front, but sacrificing a little more runtime compared to the NSGAII algorithm.