On April 23 of this year, I was accepted into the project. However, I failed the second evaluation because my main PR #2236 still does not compile to date.

Anyway, I think, and have also advised juniors in the past, that contributing to open-source libraries is largely beneficial and teaches one a lot (with or without the GSoC certificate).

Hence, I have decided to keep contributing and finish my project unofficially. My main aim was always more inclined towards getting involved with CERN and contributing to one of the most amazing science experiments of our century.

---

Here is the link I submitted for the final report -> GSoC 2017 Report. For the last three weeks I have been wrapping up the 3D use case and writing its test cases and documentation.

The only deliverables which remain:

- Computing all monomials given a max_degree for the 3D case.
- Handling implicit intersecting polygons.

I would have loved to at least get the first one merged before the SoC deadline, but unfortunately I have two tests and two lab sessions in the span of three days, so I will have to implement it after the 30th.

Let us discuss how close the above issues are to being resolved:

- For the 2D case it was simple to just generate a list of all monomials up to a given max degree. I wanted to do a similar thing for the 3D case, but was having problems finding the correct indices for the partial derivatives in the flat_list. But then I remembered a certain PR in SymPy which two SymPy members and I had worked on (PR #12490). It seems that creating a flat list out of a given matrix is a compiler optimization, so it would be a lot of work for rather little benefit if a flat_list were attempted (in the code itself) for the 3D case as well.

Therefore, a list of lists can be used for the 3D case, which makes indexing for the partial derivatives much easier. After this, all that needs to be done is to take each of the faces of the polytope, compute the left_integral of 1 over it, and then re-use that value. This is partly implemented and should be a part of the module soon, though unfortunately only after the 30th.

- As mentioned in the Report, the intersection algorithm needs to be replaced, and the case of more than two intersecting sides needs to be dealt with. The first problem isn’t that hard to deal with, but I haven’t thought about the second one yet.
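The list-of-lists idea for the 3D monomials might look something like the following sketch, which uses exponent triples in place of SymPy expressions; `gradient_terms_3d` is a hypothetical name, not the module’s API:

```python
from itertools import product


def gradient_terms_3d(max_degree):
    # Exponent triples (i, j, k) for x**i*y**j*z**k, grouped by total
    # degree: terms[d] holds every monomial of degree d, so the partial
    # derivatives of anything in terms[d] live somewhere in terms[d - 1].
    terms = [[] for _ in range(max_degree + 1)]
    for i, j, k in product(range(max_degree + 1), repeat=3):
        if i + j + k <= max_degree:
            terms[i + j + k].append((i, j, k))
    return terms


terms = gradient_terms_3d(2)
# terms[0] has 1 monomial, terms[1] has 3, terms[2] has 6
```

Indexing by total degree directly is what makes the derivative lookups easy, compared to hunting for positions in one flat list.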

GSoC has been a great learning experience, and I look forward to porting this module to SymEngine after the loose ends in SymPy are tied up. I am grateful to both my mentors, Ondrej and Prof. Sukumar, for their guidance.

---

```
p1 = [(0, 1, 0), (1, 0, 0), (0, 0, 0)]
p2 = [([-1, 0, 0], 0), ([1, 1, 0], 1), ([0, -1, 0], 0)]
```

The code should be able to figure out what the points are and then pass that list-of-points representation on to the rest of the functions. I should be done with this in a day or two. To finish up the work for GSoC I’ll get the PR on intersecting polygons sorted out. After that, the remaining documentation will have to be written and the requisite clean-up done.

---

The good thing is that the major part of my work is complete. This week I worked on the 3D case. Here is the PR: #13082. A minor limitation (minor from the perspective of fixing it) is that only constant expressions are supported. Another limitation is that the input has to be a list of the polygons constituting the faces of the 3D polytope. This should really be a list of points in the correct order, with the algorithm figuring out the faces from the input. Examples of such input are in Chin et al. (2015).

I’ll finish it up by Saturday and then proceed to completing PR #12931. That might extend into the first few days of next week as well.

Now, about the 3D case: I initially thought about writing a 3-Polytope class first, but I now realise there is no real need for one; the API can be kept simpler. I’ll update this post with a PR link for the 3D case quite soon.

---

So, from the discussion in pull request #12931:

1. Fleury’s algorithm is not required. A better approach to calculating the area is to simply flip the sides after crossing an intersection point and come back to the initial direction of traversal once that point is encountered again. I have implemented this technique locally. It has worked fine for the following polygons but still seems to have some minor bugs, which I’ll fix really soon. Once that’s done I’ll update the pull request.

```
>>> p1 = Polygon((0,-3), (-1,-2), (2,0), (-1,2), (0,3))
>>> p1.area
-13/3
>>> p2 = Polygon((0,0), (1,0), (0,1), (1,1))
>>> p2.area
1/2
```

All of this was because of the material here.

2. The intersections are currently found by a slow technique which I designed myself. A much better one, the Bentley-Ottmann algorithm, exists for the same purpose. Once I get the area working for all the test cases on the codegolf link, I’ll work on implementing it.
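For contrast, the quadratic-time pairwise check that Bentley-Ottmann improves upon looks roughly like this (a sketch, not the PR’s actual code; exact arithmetic via `Fraction`):

```python
from fractions import Fraction


def seg_intersection(p1, p2, p3, p4):
    # Intersection point of segments p1p2 and p3p4, or None.
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        return None  # parallel or collinear
    t = Fraction((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4), denom)
    u = Fraction((x1 - x3) * (y1 - y2) - (y1 - y3) * (x1 - x2), denom)
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def all_intersections(segments):
    # O(n**2): test every pair of segments.  Bentley-Ottmann avoids
    # this by sweeping a line and only testing neighbouring segments.
    points = []
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            pt = seg_intersection(*segments[i], *segments[j])
            if pt is not None:
                points.append(pt)
    return points
```

Note that adjacent polygon sides sharing a vertex also count as intersecting here, so a real implementation has to filter out shared endpoints.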

3. Christopher also suggested that we add a property called footprint_area, which would refer to the area that a polygon covers (i.e. the hull of the polygon), and another property called hull, which would refer to that very path. I was initially confusing this area with the area that depends on the direction of traversal. Now I see that this footprint area isn’t the one normally referred to when talking about the area of a polygon; it’s actually the other one.

---

The last issue with the 2D case was that implicitly intersecting polygons were not being handled properly.

So, I submitted another pull request to resolve that matter. Of course, it will probably require a considerable number of changes which reviewers will suggest. Here is the link to it: #12931.

Now, I have to begin implementing the 3D case. When making the proposal I thought of writing an entire 3-Polytope module first, but now that seems like overkill, and it isn’t a simple task either. Of course, if such a need arises then I will definitely get down to the task and write one along similar lines to the Polygon module. First, I’ll have to think about the input API for 3-Polytopes. Once I have some kind of idea, I’ll ask Ondrej for advice.

---

Unfortunately the dynamic algorithm was much faster **only** on a regular hexagon. Why that particular figure? The reason still eludes me.

But I did use the same idea to implement the multiple-polynomial case. That is, if a list of polynomials and the maximum exponent (max x degree + max y degree) of a monomial are given as input, we can calculate the integrals of all the possible monomial terms and store those values in a dictionary. Those values can then be referenced as many times as required by the polynomials. However, is this one-time cost too much?
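A toy version of the dictionary idea, with integration over the unit square standing in for the real polytope routine (all the names here are hypothetical, and the closed form 1/((i+1)(j+1)) is specific to the unit square):

```python
from fractions import Fraction


def monomial_integrals(max_degree):
    # Integral of x**i*y**j over the unit square is 1/((i + 1)*(j + 1)).
    # In the real module these values would come from the polytope
    # integration routine; the caching pattern is the point here.
    cache = {}
    for i in range(max_degree + 1):
        for j in range(max_degree + 1 - i):
            cache[(i, j)] = Fraction(1, (i + 1) * (j + 1))
    return cache


def integrate_poly(terms, cache):
    # terms: {(i, j): coefficient}.  Every lookup is O(1), so many
    # polynomials can share the one-time cost of building the cache.
    return sum(c * cache[e] for e, c in terms.items())


cache = monomial_integrals(2)
# x**2 + x + 2 over the unit square: 1/3 + 1/2 + 2 = 17/6
integrate_poly({(2, 0): 1, (1, 0): 1, (0, 0): 2}, cache)
```

Whether the up-front cost pays off depends on how many of the cached monomials the input polynomials actually use.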

The cost is greatly reduced if we do not have to re-compute the integral of the gradient.

For the 2D case we need not actually compute the left integral (the boundary term), since that translates to evaluating the function at a point and multiplying it by the distance. Also, the right integral (the one involving the inner product of x with the gradient) need not be re-calculated, since the integrals of the terms are calculated in the aforementioned order. Example: gradient_terms = [1, y, x, y**2, x*y, x**2]. Once we calculate the integral of y, we can use that result again in the right integrals of y**2 and x*y, and then the chain follows.

Therefore, only one integral has to be calculated directly, namely that of the constant term 1; every other monomial’s integral chains from it.
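A degree-graded ordering that guarantees this chaining is easy to generate. This sketch uses exponent pairs instead of SymPy expressions, and the helper name is hypothetical:

```python
def gradient_terms_2d(max_degree):
    # Exponent pairs (a, b) for x**a*y**b, listed degree by degree, so
    # the integral of x**a*y**b is already available by the time the
    # right integral of x**(a + 1)*y**b or x**a*y**(b + 1) needs it.
    terms = []
    for d in range(max_degree + 1):
        for a in range(d + 1):
            terms.append((a, d - a))
    return terms


gradient_terms_2d(2)
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```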

So, why doesn’t this technique always work? Maybe I implemented it badly for the first case. I’ll have to look at it again and see how to improve it.

Apart from the timings, Ondrej said that the 2D API looks good enough (some changes may be suggested, though). The 2D case is close to getting merged. I also have to start thinking about the 3D implementation. Of course, the immediate thought is to translate the 3D problem into a sum of 2D ones, but is that the fastest way to do it? Can some other pre-processing techniques be used to simplify the problem while still in 3D? I will have to think about it.

---

Prof. Sukumar came up with the following optimization idea:

Consider equation 10 in Chin et al. The integral of the inner product of x and the gradient of the integrand need not always be re-calculated, because it might have been calculated before. Hence, the given polynomial can be broken up into a list of monomials of increasing degree. Over a certain facet, the integrals of the list of monomials can be calculated and stored for later use. Before calculating a certain monomial we need to see whether its gradient has already been calculated and is hence available for re-use.

Then again, there’s another way to implement this idea of re-use. Given the degree of the input polynomial, we know all the monomials which can possibly occur. For example, for max_degree = 2 the monomials are: 1, x, y, x**2, x*y, y**2.

All the monomial integrals over all facets can be calculated and stored in a list of lists. This would be very useful for one specific use-case: when many different polynomials, each with max degree at most a certain global maximum, are to be integrated over a given polytope.

I’m still working on implementing the first one. That’s because the worst-case time (when no or very few dictionary terms are re-used) turns out to be much more expensive than the straightforward method. Maybe computing the standard deviation of the monomial degrees would be a good way to decide which technique to use. Here is an example of what I mean. If we have the monomial list [x**3, x**2*y, x*y**2, y**3], then the dynamic programming technique becomes useless, since the gradient of any term has degree 2 and does not belong to the list. The corresponding list of degrees is [3, 3, 3, 3], whose standard deviation is zero. Hence the normal technique is applicable here.
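The standard-deviation check itself is tiny (a sketch; `degree_spread` is a hypothetical helper, not part of the module):

```python
from statistics import pstdev


def degree_spread(monomial_degrees):
    # If every monomial has the same total degree the spread is zero:
    # no term's gradient can be looked up in the list, so the dynamic
    # (caching) technique buys nothing and the plain method wins.
    return pstdev(monomial_degrees)


degree_spread([3, 3, 3, 3])  # 0.0 -> use the straightforward method
degree_spread([1, 2, 3, 4])  # > 0 -> caching may pay off
```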

In reality, I’ll probably have to use another measure to make the dynamic-vs-normal judgement. Currently, I’m trying to fix a bug in the implementation of the first technique. After that, I’ll try out the second one.

---

Firstly, there were some minor issues:

- I had used Python floating-point numbers instead of SymPy’s exact representations in numerous places (both in the algorithm and in the test file). So, that had to be changed first.
- The decompose() method discarded all constant terms in a polynomial. Now, the constant term is stored with zero as its key.

Example:

```
>>> decompose(x**2 + x + 2)   # before
{1: x, 2: x**2}
>>> decompose(x**2 + x + 2)   # after
{0: 2, 1: x, 2: x**2}
```

- Instead of computing the gradient component-wise and passing each component to integration_reduction(), the inner product is computed directly and then passed on. This leads to only one recursive call instead of two for the 2D case (and, in future, three for the 3D case).
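The decompose() behaviour described above can be sketched with plain exponent dictionaries standing in for SymPy expressions (a hypothetical helper, not the module’s code):

```python
from collections import defaultdict


def decompose_terms(terms):
    # terms: {(i, j): coeff} for x**i*y**j.  Group by total degree,
    # keeping the constant term under key 0 instead of dropping it.
    parts = defaultdict(dict)
    for (i, j), coeff in terms.items():
        parts[i + j][(i, j)] = coeff
    return dict(parts)


# x**2 + x + 2  as  {(2, 0): 1, (1, 0): 1, (0, 0): 2}
decompose_terms({(2, 0): 1, (1, 0): 1, (0, 0): 2})
# {2: {(2, 0): 1}, 1: {(1, 0): 1}, 0: {(0, 0): 2}}
```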

Prof. Sukumar also suggested that I add the option of a hyperplane representation. This was simple to do as well: all I did was compute the intersections of the hyperplanes (lines, as of now) to get the vertices. In the case of the vertex representation, the hyperplane parameters have to be computed instead.
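The hyperplane-to-vertex conversion can be sketched as follows, assuming each facet is given as a (normal, constant) pair meaning a·x = b and that consecutive facets share a vertex (the helper name is hypothetical; exact arithmetic via `Fraction`):

```python
from fractions import Fraction


def vertices_from_hyperplanes(hyperplanes):
    # Each entry is ((a1, a2), b) for the line a1*x + a2*y = b.  The
    # vertex shared by consecutive facets solves the 2x2 linear system
    # of the two lines, here by Cramer's rule.
    vertices = []
    n = len(hyperplanes)
    for k in range(n):
        (a1, a2), b1 = hyperplanes[k]
        (c1, c2), b2 = hyperplanes[(k + 1) % n]
        det = a1 * c2 - a2 * c1
        if det == 0:
            raise ValueError("consecutive facets are parallel")
        vertices.append((Fraction(b1 * c2 - b2 * a2, det),
                         Fraction(a1 * b2 - c1 * b1, det)))
    return vertices


# unit right triangle bounded by y = 0, x + y = 1, x = 0
vertices_from_hyperplanes([((0, 1), 0), ((1, 1), 1), ((1, 0), 0)])
# [(1, 0), (0, 1), (0, 0)]
```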

**Major Issues:**

1. Another suggestion was to add the tests for 2D polytopes mentioned in the paper (page 10). The two tests which failed were polytopes with intersecting sides. In fact, this was the error I got:

```
sympy.geometry.exceptions.GeometryError: Polygon has intersecting sides.
```

It seems that the existing Polygon class in SymPy does not account for polygons with intersecting sides. At first I thought polygons were not supposed to have intersecting sides by geometric definition, but that is not true. I’ll have to discuss with my mentor how to circumvent this problem.

2. Prof. Sukumar rightly questioned the use of best_origin(). As explained in an earlier post, best_origin() finds the point on the facet which leads to an inner product of lower degree. But obviously there is an associated cost to computing that intersection point. So, I wrote some really basic code to test the current best_origin() against simply choosing the first vertex of the facet.

```
from __future__ import print_function, division
from time import time

import matplotlib.pyplot as plt
from sympy import sqrt
from sympy.core import S
from sympy.integrals.intpoly import (is_vertex, intersection, norm, decompose,
                                     best_origin, hyperplane_parameters,
                                     integration_reduction, polytope_integrate,
                                     polytope_integrate_simple)
from sympy.geometry.line import Segment2D
from sympy.geometry.polygon import Polygon
from sympy.geometry.point import Point
from sympy.abc import x, y

MAX_DEGREE = 10


def generate_polynomial(degree, max_diff):
    poly = 0
    if max_diff % 2 == 0:
        degree += 1
    for i in range((degree - max_diff)//2, (degree + max_diff)//2):
        if max_diff % 2 == 0:
            poly += x**i*y**(degree - i - 1) + y**i*x**(degree - i - 1)
        else:
            poly += x**i*y**(degree - i) + y**i*x**(degree - i)
    return poly


times = {}
times_simple = {}
for max_diff in range(1, 11):
    times[max_diff] = 0
    times_simple[max_diff] = 0


def test_timings(degree):
    # swap hexagon in for square below to produce the hexagon timings
    hexagon = Polygon(Point(0, 0), Point(-sqrt(3)/2, S(1)/2),
                      Point(-sqrt(3)/2, S(3)/2), Point(0, 2),
                      Point(sqrt(3)/2, S(3)/2), Point(sqrt(3)/2, S(1)/2))
    square = Polygon(Point(-1, -1), Point(-1, 1), Point(1, 1), Point(1, -1))
    for max_diff in range(1, degree):
        poly = generate_polynomial(degree, max_diff)
        t1 = time()
        polytope_integrate(square, poly)
        times[max_diff] += time() - t1
        t2 = time()
        polytope_integrate_simple(square, poly)
        times_simple[max_diff] += time() - t2
    return times


for i in range(1, MAX_DEGREE + 2):
    test_timings(i)

plt.plot(list(times.keys()), list(times.values()), 'b-', label="Best origin")
plt.plot(list(times_simple.keys()), list(times_simple.values()), 'r-',
         label="First point")
plt.show()
```

The following figures plot computation time against the maximum difference in the exponents of x and y. The blue line is when best_origin() is used; the red line is when the first vertex of the facet (line segment) is simply selected.

(Figures: timing plots for the hexagon and for the square.)

When the polygon has a lot of facets which intersect the axes, making it an obvious choice to select that intersection point as the best origin, the current best_origin technique works better, as with the square, where all four sides intersected the axes.

However, in the case of the hexagon the best_origin technique results in a better point than the first vertex for only one facet. The added computation time makes it more expensive than just selecting the first vertex. Of course, as the difference between the exponents increases, the time taken by best_origin is overshadowed by other parts of the algorithm. I’ll need to look at the method again and see if there are preliminary checks which can be performed, making the computation of intersections a last resort.