After computing the homography, I was able to rectify pictures as below :)
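For reference, a homography can be estimated from four or more point correspondences with the direct linear transform (DLT). The following is a minimal sketch assuming NumPy; the function names `compute_homography` and `apply_homography` are illustrative and are not necessarily the code used for the pictures here.

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 matrix H with dst ~ H @ src from N >= 4 point pairs (DLT)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Normalise so H[2, 2] == 1 (assumes H[2, 2] is not zero).
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map an (N, 2) array of points through H, including the perspective divide."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

To rectify a picture, `src` would hold the four corners of a planar object in the photo and `dst` the corners of the rectangle it should map to.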
Here are the Harris corners in the city of Lyon, after adaptive non-maximal suppression (with 400 points as max limit):
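Adaptive non-maximal suppression keeps the corners that are the strongest within the largest radius around them, which spreads the points evenly over the image. Here is a rough sketch assuming NumPy; the function name `anms` and the robustness factor `c_robust=0.9` are assumptions for illustration, and only the limit of 400 points comes from the text above.

```python
import numpy as np

def anms(coords, strengths, n_keep=400, c_robust=0.9):
    """Return indices of up to n_keep corners, ordered by suppression radius.

    coords:    (N, 2) corner positions
    strengths: (N,) positive corner responses (e.g. Harris scores)
    """
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    # Pairwise squared distances, shape (N, N). This uses O(N^2) memory.
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # stronger[i, j] is True when corner j is sufficiently stronger than corner i.
    stronger = strengths[:, None] < c_robust * strengths[None, :]
    # Squared radius of corner i = distance to the nearest sufficiently stronger corner.
    d2 = np.where(stronger, d2, np.inf)
    radii = d2.min(axis=1)  # stays inf for corners that nothing dominates
    # Keep the corners with the largest radii.
    order = np.argsort(-radii, kind="stable")
    return order[:n_keep]
```

Since the full distance matrix is built at once, it helps to threshold the Harris response first so that only a few thousand candidates go in.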
After extracting 8x8 feature descriptors for each "corner", I got bad results when I tried to match them. So I sampled the descriptor pixels with a spacing of s=4 instead of taking adjacent pixels, and got great results. See below the pairs of points detected. There are still some outliers, but that's OK for now.
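A small sketch of what sampling an 8x8 descriptor with spacing s=4 can look like, assuming NumPy and a grayscale image stored as a 2D array. The function name `extract_descriptor` is made up for this example, and the mean/standard-deviation normalisation is an assumption on my part (a common choice, not something stated above).

```python
import numpy as np

def extract_descriptor(image, row, col, size=8, spacing=4):
    """Sample a size x size patch centred on (row, col), one pixel every `spacing` pixels.

    Returns a flat vector of length size * size, or None if the window
    would fall outside the image.
    """
    # Sample offsets, symmetric around the corner: -14, -10, ..., 10, 14 for 8 and 4.
    offsets = (np.arange(size) - (size - 1) / 2.0) * spacing
    rows = np.round(row + offsets).astype(int)
    cols = np.round(col + offsets).astype(int)
    inside = (rows[0] >= 0 and cols[0] >= 0
              and rows[-1] < image.shape[0] and cols[-1] < image.shape[1])
    if not inside:
        return None
    patch = image[np.ix_(rows, cols)].astype(float)
    # Assumed normalisation: zero mean, unit variance, for robustness to lighting.
    patch -= patch.mean()
    std = patch.std()
    if std > 0:
        patch /= std
    return patch.ravel()
```

Blurring the image a little before subsampling like this generally helps avoid aliasing, since each sample then summarises its neighbourhood rather than a single pixel.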
And then I tried using RANSAC to select only the best points, but I had some problems with it.
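For completeness, here is a generic sketch of RANSAC for homography estimation, assuming NumPy. It is not the code that had problems; the names (`ransac_homography`, `fit_homography`, `project`), the 1000 iterations and the 2-pixel threshold are all assumptions chosen for illustration.

```python
import numpy as np

def fit_homography(src, dst):
    """DLT estimate of H (defined up to scale) with dst ~ H @ src."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 3)

def project(H, pts):
    """Map (N, 2) points through H, including the perspective divide."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, n_iters=1000, threshold=2.0, seed=0):
    """Return (H, inlier_mask) for matched points src[i] <-> dst[i]."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # 1. Fit a candidate homography to 4 random matches.
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        # 2. Count the matches that agree with it (degenerate samples give nan/inf).
        with np.errstate(divide="ignore", invalid="ignore"):
            errors = np.linalg.norm(project(H, src) - dst, axis=1)
            inliers = errors < threshold
        # 3. Keep the largest consensus set seen so far.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("RANSAC did not find at least 4 consistent matches")
    # 4. Refit on all inliers of the best candidate.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

The threshold is in pixels of reprojection error, so it needs to be scaled with the image resolution; too small a value rejects good matches, too large a value lets outliers through.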