Compare commits


37 Commits

Author SHA1 Message Date
coolneng c6118e2d86
Add summary 2021-06-22 15:16:46 +02:00
coolneng 03afe1a00f
Execute each algorithm once 2021-06-22 01:47:11 +02:00
coolneng 1aafc9bdda
Fix element modification in dataframe list 2021-06-22 01:10:05 +02:00
coolneng fb5e9dc703
Remove redundant code 2021-06-22 00:21:14 +02:00
coolneng 112f40d00f
Replace the population after the local search 2021-06-21 21:41:00 +02:00
coolneng 32eac42e7b
Take into account local search iterations 2021-06-21 19:51:42 +02:00
coolneng f61cb7002e
Implement memetic algorithm 2021-06-21 18:22:38 +02:00
coolneng 9aeff47bb1
Fix parameters typos 2021-06-21 17:57:48 +02:00
coolneng 764d235b4d
Rename populate dataframes to populate dataframe 2021-06-21 17:56:52 +02:00
coolneng 7cbb25c546
Add local search module 2021-06-21 17:54:37 +02:00
coolneng 20aa6b2d1e
Refactor execution script 2021-06-21 17:54:26 +02:00
coolneng 4e640ffc2d
Add memetic algorithm prototype 2021-06-21 07:39:51 +02:00
coolneng ab4748d28e
Change fitness evaluation 2021-06-21 07:39:39 +02:00
coolneng c2cc3c716d
Limit number of iterations to 100 2021-06-21 03:50:14 +02:00
coolneng e20e16d476
Clean up genetic algorithm 2021-06-21 03:48:50 +02:00
coolneng 924e4c9638
Change CLI using argparse 2021-06-21 03:46:35 +02:00
coolneng f4dd4700c7
Add crossover probability 2021-06-21 02:25:54 +02:00
coolneng 35ca73ba74
Refactor parent grouping making it more resilient 2021-06-21 02:08:27 +02:00
coolneng 48737fd6f0
Refactor best point selection 2021-06-21 00:47:48 +02:00
coolneng efd511e070
Fix populate offspring 2021-06-20 19:48:04 +02:00
coolneng 5f89731f77
Fix uniform crossover function 2021-06-20 18:04:23 +02:00
coolneng 046b1a043a
Implement uniform generational genetic algorithm 2021-06-20 05:24:56 +02:00
coolneng b4a299cbf7
Fix parent selection 2021-06-20 04:54:58 +02:00
coolneng 150457cc8e
Fix matching genes selection 2021-06-19 20:22:39 +02:00
coolneng f71aa2e1e2
Fix uniform crossover operator 2021-06-19 19:13:14 +02:00
coolneng 04fd66425e
Fix max and min selection according to fitness 2021-06-18 20:06:59 +02:00
coolneng ccf3a18a59
Implement population selection 2021-06-18 19:33:26 +02:00
coolneng c450d65870
Add population evaluation with multiprocessing 2021-06-18 18:54:34 +02:00
coolneng 04719dd8bc
Implement the stationary replacement operator 2021-06-17 23:03:03 +02:00
coolneng 7056534872
Implement the generational replacement operator 2021-06-17 22:45:59 +02:00
coolneng 135d1c48b8
Add fitness column to individual 2021-06-17 22:45:42 +02:00
coolneng 7c434fb9cd
Return 2 offsprings in the position crossover 2021-06-17 22:45:14 +02:00
coolneng 49d8383133
Replace solution instances with individual 2021-06-17 22:44:39 +02:00
coolneng ac300129ce
Remove deprecated code 2021-06-17 19:25:16 +02:00
coolneng ed41333e87
Implement mutation operator 2021-06-17 19:15:50 +02:00
coolneng 9cafdc0536
Implement binary tournament selection operator 2021-05-31 18:24:20 +02:00
coolneng 84a165ea6f
Implement mutation operator 2021-05-31 18:12:23 +02:00
8 changed files with 571 additions and 224 deletions

0
docs/.gitkeep Normal file

143
docs/Summary.org Normal file

@@ -0,0 +1,143 @@
#+TITLE: Assignment 2
#+SUBTITLE: Metaheuristics
#+AUTHOR: Amin Kasrou Aouam
#+DATE: 2021-06-22
#+PANDOC_OPTIONS: template:~/.pandoc/templates/eisvogel.latex
#+PANDOC_OPTIONS: listings:t
#+PANDOC_OPTIONS: toc:t
#+PANDOC_METADATA: lang=en
#+PANDOC_METADATA: titlepage:t
#+PANDOC_METADATA: listings-no-page-break:t
#+PANDOC_METADATA: toc-own-page:t
#+PANDOC_METADATA: table-use-row-colors:t
#+PANDOC_METADATA: colorlinks:t
#+PANDOC_METADATA: logo:/home/coolneng/Photos/Logos/UGR.png
#+LaTeX_HEADER: \usepackage[ruled, lined, linesnumbered, commentsnumbered, longend]{algorithm2e}
* Assignment 2
** Introduction
In this assignment we use several population-based search algorithms to solve the maximum diversity problem (MDP). We implement:
- A genetic algorithm
- A memetic algorithm
** Algorithms
*** Genetic
Genetic algorithms draw inspiration from natural evolution and genetics. They generate an initial set of solutions (i.e. a population), select a subset of individuals to operate on, apply recombination and mutation operations, and finally replace the previous population with a new one.
The general procedure of the algorithm is illustrated below:
\begin{algorithm}
\KwIn{A list $[a_i]$, $i=1, 2, \cdots, n$, that contains the population of individuals}
\KwOut{Processed list}
$P(t) \leftarrow initializePopulation()$
$P(t) \leftarrow evaluatePopulation()$
\While{$\neg stopCondition$}{
$t = t + 1$
$parents \leftarrow selectParents(P(t-1))$
$offspring \leftarrow recombine(parents)$
$offspring \leftarrow mutate(offspring)$
$P(t) \leftarrow replacePopulation(P(t-1), offspring)$
$P(t) \leftarrow evaluatePopulation()$
}
\KwRet{$P(t)$}
\end{algorithm}
We implement 4 distinct variants, according to 2 criteria:
**** Replacement criterion
- *Generational*: the new population completely replaces the previous one
- *Stationary*: the two best offspring replace the two worst individuals of the previous population (both schemes are sketched below)
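Both schemes can be sketched as follows (an illustrative sketch over plain lists of (genes, fitness) tuples, not the actual dataframe-based implementation):
#+begin_src python
def generational_replacement(old_population, offspring):
    # The offspring wholesale replaces the previous population
    return offspring


def stationary_replacement(old_population, offspring):
    # Individuals are (genes, fitness) tuples: drop the 2 worst
    # individuals and insert the 2 best offspring in their place
    survivors = sorted(old_population, key=lambda ind: ind[1])[2:]
    best_children = sorted(offspring, key=lambda ind: ind[1], reverse=True)[:2]
    return survivors + best_children
#+end_src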
**** Crossover operator
- *Uniform*: keeps the positions common to both parents; the rest are chosen at random from either parent (requires a repair operator)
- *Position*: keeps the positions common to both parents, takes the remaining elements of each parent and shuffles them. Produces 2 offspring (see the sketch below).
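A minimal sketch of both operators, assuming individuals are plain lists of point indices instead of the dataframes used in the actual implementation:
#+begin_src python
from random import choice, random, shuffle


def uniform_crossover(first_parent, second_parent, m, candidates):
    common = set(first_parent) & set(second_parent)
    child = set(common)
    # The remaining genes are taken at random from either parent
    for gene in set(first_parent) | set(second_parent):
        if gene not in child and random() < 0.5:
            child.add(gene)
    # Repair: the child must contain exactly m genes
    while len(child) > m:
        child.remove(choice(sorted(child)))
    while len(child) < m:
        child.add(choice([c for c in candidates if c not in child]))
    return sorted(child)


def position_crossover(first_parent, second_parent):
    common = set(first_parent) & set(second_parent)
    first_rest = [g for g in first_parent if g not in common]
    second_rest = [g for g in second_parent if g not in common]
    shuffle(first_rest)
    shuffle(second_rest)
    # Each offspring keeps the common genes plus one parent's shuffled rest
    return sorted(common) + first_rest, sorted(common) + second_rest
#+end_src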
*** Memetic
Memetic algorithms arise from hybridizing a genetic algorithm with a local search algorithm. The result is an algorithm with a good balance between exploration and exploitation.
The general procedure of the algorithm is illustrated below:
\begin{algorithm}
\KwIn{A list $[a_i]$, $i=1, 2, \cdots, n$, that contains the population of individuals}
\KwOut{Processed list}
$P(t) \leftarrow initializePopulation()$
$P(t) \leftarrow evaluatePopulation()$
\While{$\neg stopCondition$}{
\If{certain iteration}{
$P(t) \leftarrow localSearch(P(t-1))$
}
$t = t + 1$
$parents \leftarrow selectParents(P(t-1))$
$offspring \leftarrow recombine(parents)$
$offspring \leftarrow mutate(offspring)$
$P(t) \leftarrow replacePopulation(P(t-1), offspring)$
$P(t) \leftarrow evaluatePopulation()$
}
\KwRet{$P(t)$}
\end{algorithm}
We implement 3 distinct variants (sketched after this list):
- Local search over all the chromosomes
- Local search over a random subset of chromosomes
- Local search over the subset of the best chromosomes
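The strategies differ only in which individuals are refined. A sketch of that selection step, assuming only the fitness values are known (the 0.1 ratio mirrors the probability used in the implementation):
#+begin_src python
from random import sample


def choose_for_local_search(fitness_values, mode, ratio=0.1):
    # Return the indices of the individuals that undergo local search
    k = max(1, int(len(fitness_values) * ratio))
    if mode == "all":
        return list(range(len(fitness_values)))
    if mode == "random":
        return sample(range(len(fitness_values)), k)
    # mode == "best": the k fittest individuals
    ranked = sorted(
        range(len(fitness_values)), key=lambda i: fitness_values[i], reverse=True
    )
    return ranked[:k]
#+end_src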
** Implementation
The assignment is implemented in /Python/, using the following libraries:
- NumPy
- Pandas
*** Installation
To run the program, Python must be installed along with the *Pandas* and *NumPy* libraries.
A shell.nix file is provided to ease the installation of the dependencies with the [[https://nixos.org/][Nix]] package manager. After installing Nix, simply run the following command at the root of the project:
#+begin_src shell
nix-shell
#+end_src
** Execution
The program is run with the following command:
#+begin_src shell
python src/main.py <dataset> <algorithm> <parameters>
#+end_src
The possible parameters are:
| dataset                      | algorithm | parameters                               |
|------------------------------+-----------+------------------------------------------|
| Any file in the data folder  | genetic   | uniform/position generational/stationary |
|                              | memetic   | all/random/best                          |
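For example (data/example.txt is a hypothetical file name; any file in the data folder works):
#+begin_src shell
python src/main.py data/example.txt genetic uniform generational
python src/main.py data/example.txt memetic best
#+end_src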
A script is also provided that runs 1 iteration of each algorithm on each of the /datasets/ and saves the results in a spreadsheet. It can be run with the following command:
#+begin_src shell
python src/execution.py
#+end_src
*Note*: the [[https://xlsxwriter.readthedocs.io/][XlsxWriter]] library must be installed to export the results to an Excel file.
* Analysis of the results
Unfortunately, due to an excessively long execution time (even after tuning the metaparameters), we cannot provide results for the runs of the algorithms.

BIN
docs/Summary.pdf Normal file

Binary file not shown.

src/execution.py

@@ -14,70 +14,67 @@ def file_list(path):
def create_dataframes():
    return [DataFrame() for _ in range(7)]


def process_output(results):
    distances = []
    time = []
    for line in results:
        if line.startswith(bytes("Total distance:", encoding="utf-8")):
            line_elements = line.split(sep=bytes(":", encoding="utf-8"))
            distances.append(float(line_elements[1]))
        if line.startswith(bytes("Execution time:", encoding="utf-8")):
            line_elements = line.split(sep=bytes(":", encoding="utf-8"))
            time.append(float(line_elements[1]))
    return distances, time


def populate_dataframe(df, output_cmd, dataset):
    distances, time = process_output(output_cmd)
    data_dict = {
        "dataset": dataset.removeprefix("data/"),
        "media distancia": mean(distances),
        "desviacion distancia": std(distances),
        "media tiempo": mean(time),
        "desviacion tiempo": std(time),
    }
    df = df.append(data_dict, ignore_index=True)
    return df


def script_execution(filenames, df_list):
    script = "src/main.py"
    parameters = [
        ["genetic", "uniform", "generational"],
        ["genetic", "position", "generational"],
        ["genetic", "uniform", "stationary"],
        ["genetic", "position", "stationary"],
        ["memetic", "all"],
        ["memetic", "random"],
        ["memetic", "best"],
    ]
    for dataset in filenames:
        print(f"Running on dataset {dataset}")
        # Run every parameter combination so all 7 dataframes are populated
        for index, params in enumerate(parameters):
            print(f"Running {params} algorithm")
            output_cmd = run(
                [executable, script, dataset, *params], capture_output=True
            ).stdout.splitlines()
            df_list[index] = populate_dataframe(df_list[index], output_cmd, dataset)
    return df_list


def export_results(df_list):
    dataframes = {
        "Generational uniform genetic": df_list[0],
        "Generational position genetic": df_list[1],
        "Stationary uniform genetic": df_list[2],
        "Stationary position genetic": df_list[3],
        "All genes memetic": df_list[4],
        "Random genes memetic": df_list[5],
        "Best genes memetic": df_list[6],
    }
    writer = ExcelWriter(path="docs/algorithm-results.xlsx", engine="xlsxwriter")
    for name, df in dataframes.items():
        df.to_excel(writer, sheet_name=name, index=False)
@@ -91,9 +88,9 @@ def export_results(greedy, local):
def main():
    datasets = file_list(path="data/*.txt")
    df_list = create_dataframes()
    populated_df_list = script_execution(datasets, df_list)
    export_results(populated_df_list)


if __name__ == "__main__":

src/genetic_algorithm.py

@@ -1,6 +1,11 @@
from numpy import intersect1d, array_equal
from numpy.random import randint, choice, shuffle
from pandas import DataFrame
from math import ceil
from functools import partial
from multiprocessing import Pool
from copy import deepcopy
from itertools import combinations


def get_row_distance(source, destination, data):
@@ -11,148 +16,288 @@ def get_row_distance(source, destination, data):
    return row["distance"].values[0]


def compute_distance(element, individual, data):
    accumulator = 0
    distinct_elements = individual.query(f"point != {element}")
    for _, item in distinct_elements.iterrows():
        accumulator += get_row_distance(
            source=element, destination=item.point, data=data
        )
    return accumulator


def generate_individual(n, m, data):
    individual = DataFrame(columns=["point", "distance", "fitness"])
    individual["point"] = choice(n, size=m, replace=False)
    individual["distance"] = individual["point"].apply(
        func=compute_distance, individual=individual, data=data
    )
    return individual


def evaluate_individual(individual, data):
    fitness = 0
    comb = combinations(individual.index, r=2)
    for index in list(comb):
        elements = individual.loc[index, :]
        fitness += get_row_distance(
            source=elements["point"].head(n=1).values[0],
            destination=elements["point"].tail(n=1).values[0],
            data=data,
        )
    individual["fitness"] = fitness
    return individual


def select_distinct_genes(matching_genes, parents, m):
    first_parent = parents[0].query("point not in @matching_genes")
    second_parent = parents[1].query("point not in @matching_genes")
    cutoff = randint(m - len(matching_genes) + 1)
    first_parent_genes = first_parent.point.values[cutoff:]
    second_parent_genes = second_parent.point.values[:cutoff]
    return first_parent_genes, second_parent_genes


def select_shuffled_genes(matching_genes, parents):
    first_parent = parents[0].query("point not in @matching_genes")
    second_parent = parents[1].query("point not in @matching_genes")
    first_genes = first_parent.point.values
    second_genes = second_parent.point.values
    shuffle(first_genes)
    shuffle(second_genes)
    return first_genes, second_genes


def select_random_parent(parents):
    random_index = randint(len(parents))
    random_parent = parents[random_index]
    if random_parent.point.empty:
        opposite_index = 1 - random_index
        random_parent = parents[opposite_index]
    return random_parent


def get_best_point(parents, offspring):
    while True:
        random_parent = deepcopy(select_random_parent(parents))
        best_index = random_parent["distance"].idxmax()
        best_point = random_parent["point"].iloc[best_index]
        random_parent.drop(index=best_index, inplace=True)
        if best_point not in offspring.point.values:
            return best_point


def repair_offspring(offspring, parents, m):
    while len(offspring) != m:
        if len(offspring) > m:
            best_index = offspring["distance"].idxmax()
            offspring.drop(index=best_index, inplace=True)
        elif len(offspring) < m:
            best_point = get_best_point(parents, offspring)
            offspring = offspring.append(
                {"point": best_point, "distance": 0, "fitness": 0}, ignore_index=True
            )
    return offspring


def get_matching_genes(parents):
    first_parent = parents[0].point.values
    second_parent = parents[1].point.values
    return intersect1d(first_parent, second_parent)


def populate_offspring(values):
    offspring = DataFrame(columns=["point", "distance", "fitness"])
    for element in values:
        aux = DataFrame(columns=["point", "distance", "fitness"])
        aux["point"] = element
        offspring = offspring.append(aux)
    offspring["distance"] = 0
    offspring["fitness"] = 0
    return offspring


def uniform_crossover(parents, m):
    matching_genes = get_matching_genes(parents)
    first_genes, second_genes = select_distinct_genes(matching_genes, parents, m)
    offspring = populate_offspring(values=[matching_genes, first_genes, second_genes])
    viable_offspring = repair_offspring(offspring, parents, m)
    return viable_offspring


def position_crossover(parents):
    matching_genes = get_matching_genes(parents)
    first_genes, second_genes = select_shuffled_genes(matching_genes, parents)
    first_offspring = populate_offspring(values=[matching_genes, first_genes])
    second_offspring = populate_offspring(values=[matching_genes, second_genes])
    return first_offspring, second_offspring


def group_parents(parents):
    parent_pairs = []
    for i in range(0, len(parents), 2):
        first = parents[i]
        second = parents[i + 1]
        if array_equal(first.point.values, second.point.values):
            random_index = randint(i + 1)
            second, parents[random_index] = parents[random_index], second
        parent_pairs.append([first, second])
    return parent_pairs


def crossover(mode, parents, m, probability=0.7):
    parent_groups = group_parents(parents)
    offspring = []
    if mode == "uniform":
        expected_crossovers = int(len(parents) * probability)
        cutoff = expected_crossovers // 2
        for element in parent_groups[:cutoff]:
            offspring.append(uniform_crossover(element, m))
            offspring.append(uniform_crossover(element, m))
        for element in parent_groups[cutoff:]:
            offspring.append(element[0])
            offspring.append(element[1])
    else:
        for element in parent_groups:
            first_offspring, second_offspring = position_crossover(element)
            offspring.append(first_offspring)
            offspring.append(second_offspring)
    return offspring


def element_in_dataframe(individual, element):
    duplicates = individual.query(f"point == {element}")
    return not duplicates.empty


def select_new_gene(individual, n):
    while True:
        new_gene = randint(n)
        if not element_in_dataframe(individual=individual, element=new_gene):
            return new_gene


def mutate(offspring, n, data, probability=0.001):
    expected_mutations = len(offspring) * n * probability
    individuals = []
    genes = []
    for _ in range(ceil(expected_mutations)):
        individuals.append(randint(len(offspring)))
        current_individual = individuals[-1]
        genes.append(offspring[current_individual].sample().index)
    for ind, gen in zip(individuals, genes):
        individual = offspring[ind]
        individual["point"].iloc[gen] = select_new_gene(individual, n)
        individual["distance"].iloc[gen] = compute_distance(
            element=individual["point"].iloc[gen].values[0],
            individual=individual,
            data=data,
        )
    return offspring


def get_individual_index(element, population):
    for index in range(len(population)):
        if population[index].fitness.values[0] == element.fitness.values[0]:
            return index


def tournament_selection(population):
    individuals = [population[randint(len(population))] for _ in range(2)]
    best_element = max(individuals, key=lambda x: x.fitness.values[0])
    population_index = get_individual_index(best_element, population)
    return best_element, population_index


def check_element_population(element, population):
    for item in population:
        if array_equal(element.point.values, item.point.values):
            return True
    return False


def generational_replacement(prev_population, current_population):
    new_population = current_population
    best_previous_individual = max(prev_population, key=lambda x: x.fitness.values[0])
    # Elitism: reinsert the best previous individual if it did not survive
    if not check_element_population(best_previous_individual, new_population):
        worst_element = min(new_population, key=lambda x: x.fitness.values[0])
        worst_index = get_individual_index(worst_element, new_population)
        new_population[worst_index] = best_previous_individual
    return new_population


def get_best_elements(population):
    select_population = deepcopy(population)
    first_element = max(select_population, key=lambda x: x.fitness.values[0])
    first_index = get_individual_index(first_element, select_population)
    select_population.pop(first_index)
    second_element = max(select_population, key=lambda x: x.fitness.values[0])
    second_index = get_individual_index(second_element, select_population)
    return first_index, second_index


def get_worst_elements(population):
    select_population = deepcopy(population)
    first_element = min(select_population, key=lambda x: x.fitness.values[0])
    first_index = get_individual_index(first_element, select_population)
    select_population.pop(first_index)
    second_element = min(select_population, key=lambda x: x.fitness.values[0])
    second_index = get_individual_index(second_element, select_population)
    return first_index, second_index


def stationary_replacement(prev_population, current_population):
    new_population = prev_population
    first_worst, second_worst = get_worst_elements(prev_population)
    first_best, second_best = get_best_elements(current_population)
    worst_indexes = [first_worst, second_worst]
    best_indexes = [first_best, second_best]
    for worst, best in zip(worst_indexes, best_indexes):
        if (
            current_population[best].fitness.values[0]
            > prev_population[worst].fitness.values[0]
        ):
            new_population[worst] = current_population[best]
    return new_population


def replace_population(prev_population, current_population, mode):
    if mode == "generational":
        return generational_replacement(prev_population, current_population)
    return stationary_replacement(prev_population, current_population)


def evaluate_population(population, data, cores=4):
    fitness_func = partial(evaluate_individual, data=data)
    with Pool(cores) as pool:
        evaluated_population = pool.map(fitness_func, population)
    return evaluated_population


def select_parents(population, n, mode):
    select_population = deepcopy(population)
    parents = []
    if mode == "generational":
        for _ in range(n):
            element, index = tournament_selection(population=select_population)
            parents.append(element)
            select_population.pop(index)
    else:
        for _ in range(2):
            element, index = tournament_selection(population=select_population)
            parents.append(element)
            select_population.pop(index)
    return parents


def genetic_algorithm(n, m, data, select_mode, crossover_mode, max_iterations=100000):
    population = [generate_individual(n, m, data) for _ in range(n)]
    population = evaluate_population(population, data)
    for _ in range(max_iterations):
        parents = select_parents(population, n, select_mode)
        offspring = crossover(crossover_mode, parents, m)
        offspring = mutate(offspring, n, data)
        population = replace_population(population, offspring, select_mode)
        population = evaluate_population(population, data)
    best_index, _ = get_best_elements(population)
    return population[best_index]

64
src/local_search.py Normal file

@@ -0,0 +1,64 @@
from numpy.random import choice, seed, randint
from pandas import DataFrame


def get_row_distance(source, destination, data):
    row = data.query(
        """(source == @source and destination == @destination) or \
        (source == @destination and destination == @source)"""
    )
    return row["distance"].values[0]


def compute_distance(element, solution, data):
    accumulator = 0
    distinct_elements = solution.query(f"point != {element}")
    for _, item in distinct_elements.iterrows():
        accumulator += get_row_distance(
            source=element,
            destination=item.point,
            data=data,
        )
    return accumulator


def element_in_dataframe(solution, element):
    duplicates = solution.query(f"point == {element}")
    return not duplicates.empty


def replace_worst_element(previous, n, data):
    solution = previous.copy()
    worst_index = solution["distance"].astype(float).idxmin()
    random_element = randint(n)
    while element_in_dataframe(solution=solution, element=random_element):
        random_element = randint(n)
    solution["point"].loc[worst_index] = random_element
    solution["distance"].loc[worst_index] = compute_distance(
        element=solution["point"].loc[worst_index], solution=solution, data=data
    )
    return solution


def get_random_solution(previous, n, data):
    solution = replace_worst_element(previous, n, data)
    while solution["distance"].sum() <= previous["distance"].sum():
        solution = replace_worst_element(previous=solution, n=n, data=data)
    return solution


def explore_neighbourhood(element, n, data, max_iterations=100000):
    neighbourhood = []
    neighbourhood.append(element)
    for _ in range(max_iterations):
        previous_solution = neighbourhood[-1]
        neighbour = get_random_solution(previous=previous_solution, n=n, data=data)
        neighbourhood.append(neighbour)
    return neighbour


def local_search(first_solution, n, data):
    best_solution = explore_neighbourhood(
        element=first_solution, n=n, data=data, max_iterations=5
    )
    return best_solution

src/main.py

@@ -1,68 +1,57 @@
from preprocessing import parse_file
from genetic_algorithm import genetic_algorithm
from memetic_algorithm import memetic_algorithm
from time import time
from argparse import ArgumentParser


def execute_algorithm(args, n, m, data):
    if args.algorithm == "genetic":
        return genetic_algorithm(
            n,
            m,
            data,
            select_mode=args.selection,
            crossover_mode=args.crossover,
            max_iterations=100,
        )
    return memetic_algorithm(
        n,
        m,
        data,
        hybridation=args.hybridation,
        max_iterations=100,
    )


def show_results(solution, time_delta):
    duplicates = solution.duplicated().any()
    print(solution)
    print(f"Total distance: {solution.fitness.values[0]}")
    if not duplicates:
        print("No duplicates found")
    print(f"Execution time: {time_delta}")


def parse_arguments():
    parser = ArgumentParser()
    parser.add_argument("file", help="dataset of choice")
    subparsers = parser.add_subparsers(dest="algorithm")
    parser_genetic = subparsers.add_parser("genetic")
    parser_memetic = subparsers.add_parser("memetic")
    parser_genetic.add_argument("crossover", choices=["uniform", "position"])
    parser_genetic.add_argument("selection", choices=["generational", "stationary"])
    parser_memetic.add_argument("hybridation", choices=["all", "random", "best"])
    return parser.parse_args()


def main():
    args = parse_arguments()
    n, m, data = parse_file(args.file)
    start_time = time()
    solutions = execute_algorithm(args, n, m, data)
    end_time = time()
    show_results(solutions, time_delta=end_time - start_time)


if __name__ == "__main__":

src/memetic_algorithm.py

@@ -1,50 +1,59 @@
from genetic_algorithm import *
from local_search import local_search
from copy import deepcopy


def get_best_indices(n, population):
    select_population = deepcopy(population)
    best_elements = []
    for _ in range(n):
        best_index, _ = get_best_elements(select_population)
        best_elements.append(best_index)
        select_population.pop(best_index)
    return best_elements


def replace_elements(current_population, new_population, indices):
    # Pair each refined individual with the population slot it came from
    for item, refined in zip(indices, new_population):
        current_population[item] = refined
    return current_population


def run_local_search(n, data, population, mode, probability=0.1):
    neighbourhood = []
    if mode == "all":
        for individual in population:
            neighbourhood.append(local_search(individual, n, data))
        new_population = neighbourhood
    elif mode == "random":
        expected_individuals = int(len(population) * probability)
        indices = []
        for _ in range(expected_individuals):
            random_index = randint(len(population))
            random_individual = population[random_index]
            neighbourhood.append(local_search(random_individual, n, data))
            indices.append(random_index)
        new_population = replace_elements(population, neighbourhood, indices)
    else:
        expected_individuals = int(len(population) * probability)
        best_indices = get_best_indices(n=expected_individuals, population=population)
        for element in best_indices:
            neighbourhood.append(local_search(population[element], n, data))
        new_population = replace_elements(population, neighbourhood, best_indices)
    return new_population


def memetic_algorithm(n, m, data, hybridation, max_iterations=100000):
    population = [generate_individual(n, m, data) for _ in range(n)]
    population = evaluate_population(population, data)
    for i in range(max_iterations):
        if i % 10 == 0:
            population = run_local_search(n, data, population, mode=hybridation)
        parents = select_parents(population, n, mode="stationary")
        offspring = crossover(mode="position", parents=parents, m=m)
        offspring = mutate(offspring, n, data)
        population = replace_population(population, offspring, mode="stationary")
        population = evaluate_population(population, data)
    best_index, _ = get_best_elements(population)
    return population[best_index]