Changeset 1224 for framspy/dissimilarity


Timestamp: 04/18/23 01:59:59 (22 months ago)
Author: Maciej Komosinski
Message: Less ambiguous names of variables, improved docs
File: 1 edited

  • framspy/dissimilarity/density-distribution.py

    r1223 r1224  
    66
    77class DensityDistribution:
    8     """ Dissimilarity measures based on the distribution. The structure's bounding box is divided into equal-sized cuboids, the number of which depends on the 'steps' parameter. Then the structure's surface is covered with points, the density of the surface's sampling depends on the 'density' parameter. There are two versions of the measure. In the default version ('frequency'=False) a signature is computed as centroids and a number of samples in each cuboid. In the 'frequency' version FFT is computed from the vector containing the number of samples in each cuboid. The distance between signatures can be computed using EMD, L1, or L2 norms.
     8    """Two dissimilarity measures based on the spatial distribution of two Models. The Model bounding box is divided into a grid of equally-sized cuboids, the number of which is the 'resolution' parameter cubed. Then the Model surface is covered with points; the density of the surface sampling is determined by the 'density' parameter. There are two versions of the measure. In the default version ('frequency'=False), a signature of each cuboid is the centroid and the number of samples. In the 'frequency'=True version, FFT is computed from the vector containing the number of samples in each cuboid. The final result of the dissimilarity measure is the distance between the signatures and it can be computed using EMD, L1, or L2 norms (the 'metric' parameter).
    99    """
    10     libm = cdll.LoadLibrary(find_library('m'))
     10   
     11    libm = cdll.LoadLibrary(find_library('m')) # for disabling/enabling floating point exceptions (division by zero occurs in the EMD library)
    1112    EPSILON = 0.0001
    12     def __init__(self, frams_module=None, density = 10, steps = 3, reduce=True, frequency=False, metric = 'emd', fixedZaxis=False, verbose=False):
     13   
     14    def __init__(self, frams_module=None, density = 10, resolution = 3, reduce_empty=True, frequency=False, metric = 'emd', fixedZaxis=False, verbose=False):
    1315        """ __init__
    1416        Args:
    15             density (int, optional): density of samplings for frams.ModelGeometry . Defaults to 10.
    16             steps (int, optional): How many steps are used for sampling the space of voxels,
    17                 The higher the value, the more accurate the sampling and the longer the calculations. Defaults to 3.
    18             reduce (bool, optional): If we should use reduction to remove blank samples. Defaults to True.
     17            density (int, optional): density of samplings for frams.ModelGeometry. Defaults to 10.
     18            resolution (int, optional): How many intervals are used in each dimension to partition surface samples of Models in the 3D space.
     19                The higher the value, the more detailed the comparison and the longer the calculations. Defaults to 3.
     20            reduce_empty (bool, optional): If we should use reduction to remove blank samples. Defaults to True.
    1921            frequency (bool, optional): If we should use frequency distribution. Defaults to False.
    2022            metric (string, optional): The distance metric that should be used ('emd', 'l1', or 'l2'). Defaults to 'emd'.
     
    2729
    2830        self.density = density
    29         self.steps = steps
     31        self.resolution = resolution
    3032        self.verbose = verbose
    31         self.reduce = reduce
     33        self.reduce_empty = reduce_empty
    3234        self.frequency = frequency
    3335        self.metric = metric
     
    7678        """
    7779        Args:
    78             array1 ([type]): array of size n with points representing firsts model
    79             array2 ([type]): array of size n with points representing second model
    80 
    81         Returns:
    82             np.array(np.array(,dtype=float)): distance matrix n x n
     80            array1 ([type]): array of size n with points representing the first Model
     81            array2 ([type]): array of size n with points representing the second Model
     82
     83        Returns:
     84            np.array(np.array(,dtype=float)): distance matrix n*n
    8385        """
    8486        n = len(array1)
     
    9092
    9193
    92     def reduceSignaturesFreq(self,s1,s2):
     94    def reduceEmptySignatures_Frequency(self,s1,s2):
    9395        """Removes samples from signatures if corresponding samples for both models have weight 0.
    9496        Args:
     
    109111
    110112
    111     def reduceSignaturesDens(self,s1,s2):
     113    def reduceEmptySignatures_Density(self,s1,s2):
    112114        """Removes samples from signatures if corresponding samples for both models have weight 0.
    113115        Args:
     
    130132
    131133
    132     def getSignatures(self,array,steps_all,step_all):
    133         """Generates signature for array representing model. Signature is composed of list of points [x,y,z] (float) and list of weights (int).
    134 
    135         Args:
    136             array (np.array(np.array(,dtype=float))): array with voxels representing model
    137             steps_all ([np.array(,dtype=float),np.array(,dtype=float),np.array(,dtype=float)]): lists with edges for each step for each axis in order x,y,z
    138             step_all ([float,float,float]): [size of step for x axis, size of step for y axis, size of step for y axis]
     134    def getSignatures(self,array,edges3,steps3):
     135        """Generates signature for array representing the Model. Signature is composed of list of points [x,y,z] (float) and list of weights (int).
     136
     137        Args:
     138            array (np.array(np.array(,dtype=float))): array with voxels representing the Model
     139            edges3 ([np.array(,dtype=float),np.array(,dtype=float),np.array(,dtype=float)]): lists with edges for each step for each axis in order x,y,z
     140            steps3 ([float,float,float]): [size of interval for x axis, size of interval for y axis, size of interval for z axis]
    139141
    140142        Returns (distribution):
     
    143145           signature np.array(,dtype=np.float64): returns signature np.array of coefficients
    144146        """
    145         x_steps,y_steps,z_steps = steps_all
    146         x_step,y_step,z_step=step_all
     147        edges_x,edges_y,edges_z = edges3
     148        step_x,step_y,step_z=steps3
    147149        feature_array = []
    148150        weight_array = []
    149         step_half_x = x_step/2
    150         step_half_y = y_step/2
    151         step_half_z = z_step/2
    152         for x in range(len(x_steps[:-1])):
    153             for y in range(len(y_steps[:-1])) :
    154                 for z in range(len(z_steps[:-1])):
    155                     rows=np.where((array[:,0]> x_steps[x]) &
    156                                   (array[:,0]<= x_steps[x+1]) &
    157                                   (array[:,1]> y_steps[y]) &
    158                                   (array[:,1]<= y_steps[y+1]) &
    159                                   (array[:,2]> z_steps[z]) &
    160                                   (array[:,2]<= z_steps[z+1]))
     151        step_x_half = step_x/2
     152        step_y_half = step_y/2
     153        step_z_half = step_z/2
     154        for x in range(len(edges_x[:-1])):
     155            for y in range(len(edges_y[:-1])) :
     156                for z in range(len(edges_z[:-1])):
     157                    rows=np.where((array[:,0]> edges_x[x]) &
     158                                  (array[:,0]<= edges_x[x+1]) &
     159                                  (array[:,1]> edges_y[y]) &
     160                                  (array[:,1]<= edges_y[y+1]) &
     161                                  (array[:,2]> edges_z[z]) &
     162                                  (array[:,2]<= edges_z[z+1]))
    161163                    if self.frequency:
    162164                        feature_array.append(len(array[rows]))
    163165                    else:
    164                         weight, point = self.calculateNeighberhood(array[rows],[x_steps[x]+step_half_x,y_steps[y]+step_half_y,z_steps[z]+step_half_z])
     166                        weight, point = self.calculateNeighberhood(array[rows],[edges_x[x]+step_x_half,edges_y[y]+step_y_half,edges_z[z]+step_z_half])
    165167                        feature_array.append(point)
    166168                        weight_array.append(weight)
     
    175177    def getSignaturesForPair(self,array1,array2):
    176178        """Generates signatures for given pair of models represented by array of voxels.
    177         We calculate space for given models by taking the extremas for each axis and dividing the space by the number of steps.
    178         This divided space generate us samples which contains points. Each sample will have new coordinates which are mean of all points from it and weight
    179         which equals to the number of points.
     179        We compute the space shared by the given models by taking the extrema along each axis and dividing each axis into 'resolution' intervals.
     180        The divided space yields samples that contain points. Each sample gets new coordinates, which are the mean of all its points, and a weight equal to the number of its points.
    180181       
    181182        Args:
    182183            array1 (np.array(np.array(,dtype=float))): array with voxels representing model1
    183184            array2 (np.array(np.array(,dtype=float))): array with voxels representing model2
    184             steps (int, optional): How many steps is used for sampling space of voxels. Defaults to self.steps (3).
    185                
     185
    186186        Returns:
    187187            s1 ([np.array(,dtype=np.float64),np.array(,dtype=np.float64)]): [coordinates of samples, weights]
     
    196196        max_z = np.max([np.max(array1[:,2]),np.max(array2[:,2])])
    197197
    198         # We request self.steps+1 samples since we need self.steps intervals
    199         x_steps,x_step = np.linspace(min_x,max_x,self.steps+1,retstep=True)
    200         y_steps,y_step = np.linspace(min_y,max_y,self.steps+1,retstep=True)
    201         z_steps,z_step = np.linspace(min_z,max_z,self.steps+1,retstep=True)
     198        # We request self.resolution+1 samples since we need self.resolution intervals
     199        edges_x,step_x = np.linspace(min_x,max_x,self.resolution+1,retstep=True)
     200        edges_y,step_y = np.linspace(min_y,max_y,self.resolution+1,retstep=True)
     201        edges_z,step_z = np.linspace(min_z,max_z,self.resolution+1,retstep=True)
    202202       
    203         for intervals in (x_steps, y_steps, z_steps):  # EPSILON subtracted to deal with boundary voxels (one-sided open intervals and comparisons in loops in function getSignatures())
    204             intervals[0] -= self.EPSILON
    205 
    206         steps_all = (x_steps,y_steps,z_steps)
    207         step_all = (x_step,y_step,z_step)
     203        for edges in (edges_x, edges_y, edges_z):  # EPSILON subtracted to deal with boundary voxels (one-sided open intervals and comparisons in loops in function getSignatures())
     204            edges[0] -= self.EPSILON
     205
     206        edges3 = (edges_x,edges_y,edges_z)
     207        steps3 = (step_x,step_y,step_z)
    208208       
    209         s1 = self.getSignatures(array1,steps_all,step_all)
    210         s2 = self.getSignatures(array2,steps_all,step_all)   
     209        s1 = self.getSignatures(array1,edges3,steps3)
     210        s2 = self.getSignatures(array2,edges3,steps3)   
    211211       
    212212        return s1,s2
     
    217217
    218218        Args:
    219             geno (string): representation of model in one of the formats handled by frams http://www.framsticks.com/a/al_genotype.html
    220 
    221         Returns:
    222             np.array([np.array(,dtype=float)]: list of voxels representing model.
     219            geno (string): representation of Model in one of the formats supported by Framsticks, http://www.framsticks.com/a/al_genotype.html
     220
     221        Returns:
     222            np.array([np.array(,dtype=float)]: list of voxels representing the Model.
    223223        """
    224224        model = self.frams.Model.newFromString(geno)
     
    236236            voxels1 np.array([np.array(,dtype=float)]: list of voxels representing model1.
    237237            voxels2 np.array([np.array(,dtype=float)]: list of voxels representing model2.
    238             steps (int, optional): How many steps is used for sampling space of voxels. Defaults to self.steps (3).
    239238
    240239        Returns:
     
    250249            print("Base voxels fig1: ", numvox1, " fig2: ", numvox2)
    251250            print("After reduction voxels fig1: ", sum(s1[1]), " fig2: ", sum(s2[1]))
    252             raise ValueError("Bad signature!")
    253 
    254         reduce_fun = self.reduceSignaturesFreq if self.frequency else self.reduceSignaturesDens
    255         if self.reduce:
     251            raise RuntimeError("Bad signature!")
     252
     253        reduce_fun = self.reduceEmptySignatures_Frequency if self.frequency else self.reduceEmptySignatures_Density
     254        if self.reduce_empty:
    256255            s1, s2 = reduce_fun(s1,s2)
    257256
    258257            if not self.frequency:
    259258                if numvox1 != sum(s1[1]) or numvox2 != sum(s2[1]):
    260                     print("Voxel reduction didnt work properly")
     259                    print("Voxel reduction didn't work properly")
    261260                    print("Base voxels fig1: ", numvox1, " fig2: ", numvox2)
    262261                    print("After reduction voxels fig1: ", sum(s1[1]), " fig2: ", sum(s2[1]))
     262                    raise RuntimeError("Voxel reduction error!")
    263263       
    264264        if self.metric == 'l1':
     
    281281                dist_matrix = self.calculateDistanceMatrix(s1[0],s2[0])
    282282
    283             self.libm.fedisableexcept(0x04)  # allowing for operation divide by 0 because pyemd requiers it.
     283            self.libm.fedisableexcept(0x04)  # change default flag value - don't cause exceptions when dividing by 0 (pyemd does it)
    284284
    285285            if self.frequency:
     
    288288                out = emd(s1[1],s2[1],dist_matrix)
    289289
    290             self.libm.feclearexcept(0x04) # disabling operation divide by 0 because framsticks doesnt like it.
     290            self.libm.feclearexcept(0x04) # restoring default flag values...
    291291            self.libm.feenableexcept(0x04)
    292292
     
    300300        """Calculates EMD for a pair of genotypes.
    301301        Args:
    302             geno1 (string): representation of model1 in one of the formats handled by frams http://www.framsticks.com/a/al_genotype.html
    303             geno2 (string): representation of model2 in one of the formats handled by frams http://www.framsticks.com/a/al_genotype.html
    304             steps (int, optional): How many steps is used for sampling space of voxels. Defaults to self.steps (3).
     302            geno1 (string): representation of model1 in one of the formats supported by Framsticks, http://www.framsticks.com/a/al_genotype.html
     303            geno2 (string): representation of model2 in one of the formats supported by Framsticks, http://www.framsticks.com/a/al_genotype.html
    305304
    306305        Returns:
     
    314313
    315314        if self.verbose == True:
    316             print("Steps: ", self.steps)
     315            print("Intervals: ", self.resolution)
    317316            print("Geno1:\n",geno1)
    318317            print("Geno2:\n",geno2)
     
    325324        """
    326325        Args:
    327             listOfGeno ([string]): list of strings representing genotypes in one of the formats handled by frams http://www.framsticks.com/a/al_genotype.html
     326            listOfGeno ([string]): list of strings representing genotypes in one of the formats supported by Framsticks, http://www.framsticks.com/a/al_genotype.html
    328327
    329328        Returns:
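The renamed 'resolution' parameter sets the number of intervals per axis, so the bounding box is split into resolution cubed cuboids. The following is a minimal sketch of that partitioning step, not the code from density-distribution.py: the helper name `grid_counts` is hypothetical, it handles a single point set (the changeset code computes a common bounding box for a pair of Models), and it only counts samples per cuboid without computing centroids. It mirrors the half-open comparisons and the EPSILON adjustment of the first edge that the diff shows in getSignatures() and getSignaturesForPair().

```python
import numpy as np

EPSILON = 0.0001  # same value as the EPSILON constant shown in the diff


def grid_counts(points, resolution=3):
    """Count samples in each cuboid of a resolution x resolution x resolution grid.

    Hypothetical helper for illustration only; it is not part of the changeset.
    """
    points = np.asarray(points, dtype=float)
    edges3 = []
    for axis in range(3):
        # resolution+1 edges are requested because resolution intervals are needed
        edges = np.linspace(points[:, axis].min(), points[:, axis].max(), resolution + 1)
        # intervals are tested as (low, high], so the first edge is moved down
        # slightly to keep points that lie exactly on the lower boundary
        edges[0] -= EPSILON
        edges3.append(edges)
    edges_x, edges_y, edges_z = edges3

    counts = np.zeros((resolution, resolution, resolution), dtype=int)
    for x in range(resolution):
        for y in range(resolution):
            for z in range(resolution):
                inside = (
                    (points[:, 0] > edges_x[x]) & (points[:, 0] <= edges_x[x + 1])
                    & (points[:, 1] > edges_y[y]) & (points[:, 1] <= edges_y[y + 1])
                    & (points[:, 2] > edges_z[z]) & (points[:, 2] <= edges_z[z + 1])
                )
                counts[x, y, z] = np.count_nonzero(inside)
    return counts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.random((200, 3))  # made-up surface samples, for demonstration only
    counts = grid_counts(samples, resolution=3)
    print(counts.shape, counts.sum())
```

Because the intervals along each axis are disjoint and together cover the whole bounding box, every sample is counted exactly once, which is the property that the "Bad signature!" check in the changeset relies on.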